ABSTRACT


Linear  attacks  against  cryptosystems  can  be  defeated  when  combiner  functions 
are  composed  of  highly  nonlinear  Boolean  functions.  The  highest  nonlinearity  Boolean 
functions, or bent functions, are not common, and they are especially difficult to find
when they have many variables. Understanding what properties are
common  to  bent  functions  will  help  ease  the  search  for  them. 

Using  the  SRC-6  reconfigurable  computer,  functions  can  be  generated  or  tested  at 
a  rate  much  higher  than  a  PC.  This  thesis  uses  the  SRC-6  to  characterize  data  for 
functions  with  4,  5  and  6  variables.  The  data  compiled  showed  trends  based  on  the  order, 
homogeneity,  balance,  and  symmetry  of  Boolean  functions.  The  transeunt  triangle  is 
used to convert a Boolean function into Algebraic Normal Form, so that the properties
are easily determined. The first known proof that the transeunt triangle correctly converts
between the two representations of a Boolean function is included.

The  SRC-6,  while  capable  of  pipelining  code  so  that  it  runs  up  to  six  thousand 
times  faster  than  a  PC,  is  limited  by  the  speed  of  the  FPGA,  100  MHz.  Functions  with  up 
to  six  variables  were  tested.  Predictions  on  this  data,  as  well  as  ways  to  improve  the 
computing  capability  of  the  SRC-6,  are  included. 

EXECUTIVE SUMMARY


This  thesis  presents  an  analysis  of  Boolean  function  properties  with  a  focus  on 
bent  functions.  Bent  functions  are  Boolean  functions  with  the  highest  nonlinearity.  An 
examination  of  the  resulting  data  shows  trends  in  nonlinearity  based  on  degree, 
homogeneity, and certain types of symmetry. The SRC-6, a reconfigurable computer
with  100  MHz  FPGAs,  is  used  in  this  thesis  as  a  parallel  computation  tool.  This  thesis 
shows  how  the  SRC-6  can  be  used  to  perform  tests  on  very  large  data  sets,  i.e.,  greater 
than  2  ,  in  a  fraction  of  the  time  it  would  take  a  PC.  Current  limitations  are  discussed  as 
well  as  ideas  for  future  improvement. 

Boolean  functions  are  used  as  components  in  many  cryptosystems  and  must  have 
properties  that  make  the  system  secure  against  some  known  attacks  (linear  and 
differential  cryptanalysis,  algebraic  attacks,  etc.).  Introduced  by  O.S.  Rothaus  in  the 
1960s,  bent  functions  have  as  large  a  distance  as  possible  from  the  set  of  affine  functions. 
This  property  is  a  major  factor  in  resisting  linear  and  other  code-breaking  techniques. 
Since  bent  functions  are  few  and  far  between,  it  is  a  challenge  to  find  them.  For  this 
thesis,  the  SRC-6  reconfigurable  computer,  a  resource  at  the  Naval  Postgraduate  School, 
enabled  searches  on  billions  of  functions  to  produce  those  of  interest. 

There  have  been  several  studies  involving  properties  of  bent  functions  and  the 
search  for  specific  groups  of  Boolean  functions  that  are  rich  in  bent  functions.  For 
example,  functions  of  certain  degrees  are  known  to  include  bent  functions,  while 
functions  of  other  degrees  are  known  to  exclude  bent  functions.  If  more  characteristics 
like  these  can  be  discovered,  bent  function  searches  can  become  less  time-consuming.  In 
this  thesis,  a  study  of  groups  of  functions,  such  as  homogeneous  functions,  rotation 
symmetric  functions,  dihedral  symmetric  functions  and  balanced  functions,  with  respect 
to  their  nonlinearity,  is  conducted.  No  bent  function  is  balanced,  but  it  is  important  to 
find  balanced  functions  with  high  nonlinearity.  The  results  of  tests  for  this  property  are 
also  included  in  this  thesis.  Further  research  can  lead  to  the  discovery  of  new  groups  of 
bent  and  highly  nonlinear  functions. 

The SRC-6 reconfigurable computer currently uses a Xilinx Virtex II Field
Programmable Gate Array that runs on a 100 MHz clock. This means that, if a program
written for the SRC-6 can produce and test one function per clock cycle, a maximum of
100 million functions can be tested per second. This speed is insufficient for testing very
large groups of functions, groups larger than $2^{40}$, especially when it takes more than one
clock  cycle  to  test  a  function.  This  thesis  provides  several  ideas  on  how  to  increase  the 
throughput  of  a  program  run  on  the  SRC-6. 

A  function  can  be  written  as  a  truth  table  or  in  Algebraic  Normal  Form  (ANF). 
Both  forms  are  important  for  identifying  certain  properties  of  a  function.  The  ability  to 
convert  a  function  from  one  form  to  the  other  allows  a  function’s  properties  to  be  studied 
easily.  One  conversion  method  is  the  transeunt  triangle.  In  1986,  Green  introduced  the 
transeunt  triangle  but  did  not  prove  that  it  correctly  converted  a  truth  table  to  an  ANF. 
The  proof  included  in  this  thesis  is  the  first  known  proof  that  this  conversion  method 
holds.  The  implementation  of  the  transeunt  triangle  on  a  reconfigurable  computer  is  also 
a  unique  contribution.  As  far  as  we  know,  the  transeunt  triangle  has  never  before  been 
used  in  computation  (only  in  design).  Because  certain  properties  can  only  be  recognized 
in one form or the other, the one-clock-cycle conversion method, developed in this thesis,
allowed, for example, a function of degree 3 to be generated in Algebraic Normal Form,
converted  to  a  truth  table  and  then  tested  for  balance.  This  important  test  cannot  be  done 
easily  without  this  simple  conversion  method. 

Future  work  is  described  including  the  search  for  more  groups  of  functions  with 
good  cryptographic  properties.  Bentness  is  not  the  only  desired  property  of  good 
cryptographic  functions.  Other  properties  include  balancedness,  strict  avalanche  criteria, 
propagation  criteria  and  correlation  immunity.  The  SRC-6  can  be  used  to  investigate 
these  other  properties. 

I. INTRODUCTION


A.  OBJECTIVE 

The importance of bent functions in modern cryptography motivates the study
done for this master's thesis. By using the SRC-6 computer available at the Naval
Postgraduate School, millions of Boolean functions were generated and tested. An
algorithm using the transeunt triangle has never before been implemented at this scale.
Data  was  broken  down  according  to  specific  properties  of  Boolean  functions,  including 
degree,  homogeneity,  and  symmetry.  Next,  these  groups  were  evaluated  for  relationships 
between  nonlinearity  and  specific  properties.  The  objective  is  to  find  groups  of  Boolean 
functions  that  may  be  rich  in  bent  functions  [1].  These  groups,  if  small  enough,  can  be 
tested  exhaustively  whereas  the  entire  set  of  functions,  even  for  small  numbers  of 
variables,  i.e.,  n= 6,  7,  8,  cannot  be  tested  in  a  reasonable  amount  of  time.  The  use  of  the 
transeunt  triangle  enables  functions  to  be  generated  easily  in  one  form,  converted  to 
another  form  and  then  tested  for  nonlinearity.  Without  the  transeunt  triangle  [2],  [3], 
important  groups  of  functions  could  not  be  tested  efficiently. 

B.  BACKGROUND 

Bent  functions  were  first  introduced  by  O.  S.  Rothaus  in  1976  [4].  His  original 
paper  had  restricted  circulation  for  about  10  years.  The  term  bent  was  probably  chosen  to 
mean the opposite of linear. A bent function is at the maximum distance from the
set  of  affine  functions.  Bent  functions  have  practical  applications  in  spread  spectrum 
communications,  cryptography  and  coding  theory  [5].  This  thesis  concentrates  on 
properties  of  bent  functions  as  they  apply  to  cryptography. 

The  Department  of  Defense  and  the  National  Security  Agency  are  increasingly 
interested  in  cryptographic  advances.  The  importance  of  code-breaking  in  World  War  II 
showed that secure communications are necessary for Major Combat Operations. Creating
a  source  of  extremely  strong  cryptographic  components  will  ensure  that  the  Department 
of  Defense  can  communicate  securely  and  even  discourage  potential  adversaries  from 
action  if  they  believe  they  cannot  communicate  securely. 

The information that flows across the Internet must also be secure. The Advanced
Encryption Standard was created in response to a competition initiated by NIST (National
Institute of Standards and Technology) in 1998. This current standard uses a block
cipher involving a randomly generated key combined with the plaintext message in
multiple steps, some of which involve substitution boxes (S-boxes) with high
nonlinearity  characteristics.  There  are  several  types  of  stream  and  block  ciphers,  but  the 
encryption  part  of  the  cipher  is  where  the  bent  functions,  or  modifications  of  these,  may 
be  incorporated. 

Universities,  technical  businesses  and  government  agencies  are  doing  work  on 
Boolean  functions  [6],  [7],  [8].  Since  knowledge  of  these  functions  is  important  to  all 
countries  for  cryptography,  governments  put  emphasis  on  increasing  expertise  in  this 
field.  The  linear  attack  in  code-breaking  is  one  of  the  best  known,  but  the  use  of 
nonlinear functions will counter this attack. It follows that the more research is completed
on secure encryption techniques, the easier it will be to develop counter-encryption
techniques.

Nonlinearity  in  functions  is  just  one  property  necessary  to  create  strong 
cryptographic  functions.  Research  is  also  performed  on  characteristics  like  propagation 
criteria,  strict  avalanche  criteria,  correlation  immunity  and  balance  [9].  Constructing  a 
truly  strong  cryptographic  function  requires  several  tests  and  trials.  Understanding  how 
to  construct  bent  functions  from  smaller  bent  functions  is  a  topic  of  increasing 
importance [10]. This capability will lessen the need to exhaustively test and search for
them. Creating a set of known bent functions on smaller numbers of variables, such as
$n \le 8$, will increase the number of possible constructions that can be performed.

C.  METHOD 

A Boolean function on n variables assigns a Boolean value, 0 or 1, to each of the $2^n$
combinations of values of these variables, so it can be written as a string of ones and
zeros. A Boolean function can be represented by 1) a truth table (TT) or 2) Algebraic
Normal  Form  (ANF).  A  very  simple  algorithm  for  converting  a  function  from  one  form 
to  the  other  exists  and  is  called  the  transeunt  triangle,  also  known  as  the  triangular 
transform method [11]. It involves a series of Exclusive-Or operations. In an ANF,
properties of a function, like its degree or homogeneity, are easy to determine. By
examining a TT, it is easy to determine whether a function is rotation symmetric or
dihedral  symmetric.  The  algorithm  that  tests  for  nonlinearity  requires  the  TT  of  a 
function.  Using  the  transeunt  triangle,  it  is  simple  to  move  between  forms  to  find  specific 
properties  and  test  for  nonlinearity  and  possibly  other  cryptographic  features. 

The  SRC-6  computer  system  is  used  to  perform  computations  on  millions  of  test 
functions.  The  system  uses  Field  Programmable  Gate  Array  (FPGA)  technology  to  turn 
code  into  hardware  that  can  execute  faster  than  a  PC.  It  gives  the  programmer  more 
control  over  the  actual  design  of  the  circuit,  not  just  the  function  to  be  performed.  It  can 
also  use  a  type  of  parallel  programming  called  pipelining.  This  is  important  when  the 
circuit  is  so  large  that  it  has  large  delay.  A  circuit  that  uses  pipelining  can  process  more 
than  one  function  at  a  time,  dividing  the  process  into  steps  so  that  a  new  function  can  be 
sent  to  the  first  step,  while  the  previous  function  is  being  processed  in  the  second  step, 
etc.  The  ability  to  test  Boolean  functions  several  at  a  time  speeds  up  the  computation 
time. Using a pipeline, a result can be produced at every clock cycle. On a 100
MHz FPGA, one hundred million functions can be tested per second. This is
much faster than a modern PC, which cannot pipeline its tasks to the same degree as the
SRC-6.

D. RELATED WORK

Research  on  bent  Boolean  functions  is  a  prominent  subject  in  the  cryptography 
community. As the number of variables, n, increases, the length of the function (number
of truth table entries) grows as $2^n$ and the number of Boolean functions becomes $2^{2^n}$,
making exhaustive testing extremely time consuming. This is why finding trends in
known  bent  functions  and  testing  functions  that  follow  these  trends  is  often  much  more 
successful  than  exhaustive  search  [12],  [13],  [14]. 

There  are  other  ways  of  finding  bent  functions,  such  as  binary  decision  trees  [15] 
and  genetic  algorithms  [16].  Bent  functions  can  also  be  constructed  from  smaller  bent 
functions. The ultimate goal is to create a database of bent functions, or find a
mathematical  description  or  construction  that  can  generate  them  all. 


E.  THESIS  OUTLINE 

The  outline  is  as  follows.  Chapter  I  is  the  introduction,  Chapter  II  is  an 
explanation  of  bent  functions,  Chapter  III  is  an  explanation  and  proof  of  the  transeunt 
triangle,  Chapter  IV  is  computation  and  analysis,  and  Chapter  V  provides  conclusions. 
Appendix  A  contains  code  for  the  SRC-6,  Appendix  B  contains  C  code,  and  Appendix  C 
contains  lists  of  functions  of  interest. 

II. AN EXPLANATION OF BENT BOOLEAN FUNCTIONS


A. UNDERSTANDING BENT FUNCTIONS
1. Definitions

Let $V_n$ be the vector space of dimension n over the two-element field $F_2$:

$V_n = \{(x_1, \ldots, x_n) \mid x_i \in \{0,1\}\}$

Definition  2.1.  A  truth  table  (TT)  is  the  output  table  of  the  Boolean  function, 
where the input runs through the entire vector space in lexicographical order.

Definition 2.2. The Algebraic Normal Form (ANF), also called the positive
polarity Reed-Muller Form, of a function f is:

$f(x_1, x_2, \ldots, x_n) = a_0 \oplus a_1x_1 \oplus \cdots \oplus a_nx_n \oplus a_{12}x_1x_2 \oplus \cdots \oplus a_{n-1,n}x_{n-1}x_n \oplus \cdots \oplus a_{12 \ldots n}x_1x_2 \cdots x_n$
where each coefficient $a_i$ takes values in $F_2$.

Example 2.1. The truth table of the AND of two variables is:

| $x_2$ | $x_1$ | $f$ |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

The ANF of this function is $f(x_1, x_2) = x_1x_2$.

Example 2.2. The truth table of the Exclusive-Or ($\oplus$) of two variables is:

| $x_2$ | $x_1$ | $f$ |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

The ANF of this function is $f(x_1, x_2) = x_1 \oplus x_2$.

Definition  2.3.  A  term  is  the  AND  of  variables  or  their  complement. 

Definition 2.4. The degree of a function f is the largest number of variables in a
term  in  the  Algebraic  Normal  Form  of  f. 

Definition 2.5. The Hamming distance d(f,g) between two functions f and g is
the number of places where their truth tables do not have the same value. It can
also be interpreted as the weight of $f \oplus g$, that is, the number of ones in the
result of a bit-wise Exclusive-Or of f and g.

Example 2.3.

$f(x_3, x_2, x_1)$: 00011011
$g(x_3, x_2, x_1)$: 11000101
$f \oplus g$: 11011110

Distance d(f,g): 6 (there are 6 ones in the result)
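
The computation in Example 2.3 can be sketched in a few lines of Python. This is only an illustration; the function name `hamming_distance` and the use of bit strings for truth tables are assumptions made here, not part of the thesis code.

```python
def hamming_distance(f: str, g: str) -> int:
    """Number of positions where two equal-length truth-table strings differ."""
    if len(f) != len(g):
        raise ValueError("truth tables must have the same length")
    # XOR the two bit strings as integers, then count the ones in the result.
    return bin(int(f, 2) ^ int(g, 2)).count("1")


f = "00011011"
g = "11000101"
print(format(int(f, 2) ^ int(g, 2), "08b"))  # 11011110
print(hamming_distance(f, g))                # 6
```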

Definition 2.6. A linear function is the Exclusive-Or of single variables. Ex.
$f(x_1, x_2, x_3, x_4) = x_1 \oplus x_2 \oplus x_4$.

Definition 2.7. An affine function is a linear function or the complement of a
linear function. Ex. $f(x_1, x_2, x_3, x_4) = x_1 \oplus x_2 \oplus x_4 \oplus 1$.

Definition 2.8. The nonlinearity of a function f is the minimum Hamming
distance between f and all affine functions. Table 1 shows an example where the
function $B = x_1x_2 \oplus x_3x_4$ is tested against all affine functions for n=4. This
function's nonlinearity is 6.
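
Definition 2.8 can be checked directly by brute force for small n. The sketch below is an illustration in Python, not the Verilog circuit used in this thesis; the helper names `truth_table` and `nonlinearity` and the convention that bit i-1 of the table index holds $x_i$ are assumptions made here.

```python
def truth_table(func, n):
    """Truth table of func as a list of 0/1 values; bit i-1 of the index holds x_i."""
    return [func(*[(x >> i) & 1 for i in range(n)]) & 1 for x in range(2 ** n)]


def nonlinearity(tt, n):
    """Minimum Hamming distance from tt to all 2^(n+1) affine functions."""
    best = len(tt)
    for w in range(2 ** n):  # w selects which variables appear in the linear part
        # Distance to the linear function a(x) = parity(w AND x).
        d = sum(bit ^ (bin(w & x).count("1") & 1) for x, bit in enumerate(tt))
        # Complementing a(x) turns a distance of d into 2^n - d.
        best = min(best, d, len(tt) - d)
    return best


B = truth_table(lambda x1, x2, x3, x4: (x1 & x2) ^ (x3 & x4), 4)
print(nonlinearity(B, 4))  # 6
```

The affine function of Definition 2.7 gives 0 under the same test, since it is at distance 0 from itself.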

Table 1. The Computation of the Nonlinearity of $B = x_1x_2 \oplus x_3x_4$ (From [17])

Definition 2.9. A bent function is a Boolean function that is as far away as
possible  from  all  affine  functions,  i.e.,  it  has  the  largest  nonlinearity. 

Definition 2.10. A homogeneous function is a Boolean function whose ANF
has terms all of the same degree. The disjoint quadratic function is one example:

$f = x_1x_2 \oplus x_3x_4 \oplus \cdots \oplus x_{n-1}x_n$. When n is a positive even integer, f is a homogeneous
function.

Definition 2.11. An orbit consists of terms that can be circularly rotated to form
other terms within the orbit in the truth table. The variables in one term are
shifted circularly n times and the resulting terms have the same truth table value.

Example 2.4. An orbit is $\{x_1x_2, x_2x_3, x_3x_4, x_1x_4\}$. The function
$f(x_1, x_2, x_3, x_4) = x_1x_2 \oplus x_2x_3 \oplus x_3x_4 \oplus x_1x_4$ contains one orbit where each truth
table value is 1.

Definition 2.12. A Boolean function f is rotation symmetric if it is invariant
under all cyclic rotations of the inputs. Rotation symmetric functions can be
divided into orbits so that each orbit consists of all cyclic shifts of one input [14].
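
Example 2.5. Below, we give the truth table of a rotation symmetric function:

| $x_4$ | $x_3$ | $x_2$ | $x_1$ | $f$ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 1 | 0 |
| 0 | 0 | 1 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 |
| 0 | 1 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 1 | 0 |
| 1 | 0 | 1 | 0 | 0 |
| 1 | 0 | 1 | 1 | 1 |
| 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 0 | 1 | 1 |
| 1 | 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 | 1 |

The ANF of this function is $f = x_1x_2x_3 \oplus x_1x_2x_4 \oplus x_1x_3x_4 \oplus x_2x_3x_4 \oplus x_1x_2x_3x_4$. Note
that the algebraic expression is unchanged by the permutation
$x_1 \to x_2 \to x_3 \to x_4 \to x_1$.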
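
Rotation symmetry of a truth table such as the one in Example 2.5 can be tested mechanically. The following Python sketch is illustrative only; `rotate_left` and `is_rotation_symmetric` are hypothetical helper names, and the table index is assumed to be the binary value of $x_4x_3x_2x_1$. Invariance under a single one-position rotation is enough, because every cyclic rotation is a repetition of it.

```python
def rotate_left(x, n):
    """Cyclically rotate the n-bit input vector x by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def is_rotation_symmetric(tt, n):
    """True if the function value is unchanged by rotating the input vector."""
    return all(tt[x] == tt[rotate_left(x, n)] for x in range(1 << n))


# Truth table of Example 2.5, listed for inputs 0000 through 1111.
example_2_5 = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(is_rotation_symmetric(example_2_5, 4))  # True

# The single-variable function f = x1 is not rotation symmetric.
only_x1 = [x & 1 for x in range(16)]
print(is_rotation_symmetric(only_x1, 4))      # False
```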


Definition 2.13. A dihedral symmetric function is a rotation symmetric function
that also has the reflection property. If a function f has orbits that, when their
variables are flipped, become equivalent to other orbits and the function values of
every term in both orbits are equivalent, then f is dihedral symmetric. Simply put,
f must satisfy $f(x_1, x_2, \ldots, x_n) = f(x_n, \ldots, x_2, x_1)$ in addition to the rotation symmetry. For
n=4 all rotation symmetric functions are also dihedral symmetric, since the
reflection of each orbit gives the same orbit.

Example 2.6. Two dihedral symmetric orbits for n=6 are:

| Orbit 1 | Orbit 2 |
|---|---|
| 001011 ($x_1x_2x_4$) | 110100 ($x_3x_5x_6$) |
| 010110 ($x_2x_3x_5$) | 011010 ($x_2x_4x_5$) |
| 101100 ($x_3x_4x_6$) | 001101 ($x_1x_3x_4$) |
| 110010 ($x_2x_5x_6$) | 010011 ($x_1x_2x_5$) |
| 100101 ($x_1x_3x_6$) | 101001 ($x_1x_4x_6$) |

In a dihedral symmetric function, all truth table values for the above places will
be the same.

Definition 2.14. The number of variables in a function is referred to as n in this
paper. If n=4, the variables are listed as $x_4x_3x_2x_1$ and the function length is $2^n$
bits. There are $2^{2^n}$ possible functions on n variables.

2.  Known  Characteristics 

There  have  been  numerous  studies  on  bent  Boolean  functions,  both  theoretical 
and  computational.  There  are  several  important  characteristics  already  known  about  bent 
functions. The exact number of bent functions is known only for $n \le 8$, as shown in
Table 2. The following are some important theorems and lemmas related to bent
functions. They help narrow the field of functions to search by eliminating groups of
functions known not to be bent. Bent functions are only found in the set of Boolean
functions on an even number of variables (n even).


| n | Number of Bent Functions |
|---|---|
| 4 | 896 |
| 6 | 5,425,430,528 |
| 8 | $9.9 \times 10^{31}$ |

Table 2. Number of Bent Functions on n Variables

Theorem 2.1. There are no homogeneous bent functions of degree m on 2m
variables for m>3. [13]

Theorem 2.2. The maximum nonlinearity for an n-variable function when n is
even is $2^{n-1} - 2^{n/2-1}$. For odd $n \ge 7$, the maximum nonlinearity is unknown. [1]

Lemma 2.1. A bent function Exclusive-Or'd with an affine function is also a bent
function. That is, if f is bent and g is affine, then $f \oplus g$ is also bent. [4]

Lemma 2.2. Let f be a bent function. Then $2 \le \mathrm{order}(f) \le n/2$ for $n \ge 4$. [4]

3. Limitations

Exhaustive  search  of  bent  functions  is  limited  by  the  very  large  number  of 
functions  over  which  a  search  is  conducted.  Since  bent  functions  on  8  variables  or  less 
are  already  known,  concentration  on  longer  functions  is  the  next  step.  For  example,  the 
TT  of  a  function  of  10  variables  is  1024  bits  long.  While  this  is  an  appropriate  length  for 
a  cryptographic  component,  most  computers  can  only  store  64  bits  in  a  register.  In  an 
FPGA,  any  size  registers  can  be  created.  The  program  is  then  limited  by  the  speed  of  the 
FPGA.  The  number  of  functions  to  be  tested  increases  exponentially  so  even  a  pipelined 
code  that  can  compute  one  function  per  clock  cycle  will  take  years  to  run.  The  faster 
computer  technology  becomes,  the  easier  it  will  be  to  test  larger  groups  of  functions. 

B.  NOTATION  AND  COMPUTING  NONLINEARITY 

Let n be the number of variables in the function set $F_n$, where $f(x_1, x_2, \ldots, x_n)$
represents one of $2^{2^n}$ functions as a truth table. Let $A_n$ be the set of $2^{n+1}$ affine functions,
where $a_j(x_1, x_2, \ldots, x_n)$ is one affine function as a truth table. The distance between one
function in $F_n$ and one function in $A_n$ is the number of function values that are different
between the two. This is found by performing the operation $f \oplus a_j$ and counting the
number of ones in the result: $D_j = wt(f \oplus a_j)$, where $wt$ is the Hamming weight, so that
$D_j$ is the Hamming distance between f and $a_j$. To find a bent function, the distance between a prospective bent
function and each affine function must be computed and compared to $2^{n-1} - 2^{n/2-1}$. The
minimum distance among the results is called the nonlinearity of the function,
$NL_f = \min(D_0, D_1, \ldots, D_{2^{n+1}-1})$. After every function on n variables is tested and its
nonlinearity is found, the functions with the highest nonlinearity are called bent when n is
even. The nonlinearity of a bent function is $NL_B = \max(NL_{f_0}, NL_{f_1}, \ldots, NL_{f_{2^{2^n}-1}})$. In other
words, a function $f(x_1, x_2, \ldots, x_n)$ is bent if its nonlinearity, $NL_B$, is $2^{n-1} - 2^{n/2-1}$.

Bent functions can be separated into classes. One class is the A-class [1], where
if $h = f \oplus g$ and g is an affine function, then f and h belong to the same A-class. Since
affine functions have degree 1 or 0, functions with terms of these degrees do not need to
be searched. If a bent function is found with terms of degree 2 and 3, then $2^{n+1}$ more bent
functions can be found by performing an Exclusive-Or operation with the bent function
and each affine function. This will result in all functions of the same A-class. This
property aids in the search for bent functions by reducing the search area.
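
The A-class property (and Lemma 2.1) can be illustrated for n=4 with a small Python sketch. It starts from the bent function $x_1x_2 \oplus x_3x_4$ of Table 1, forms its Exclusive-Or with every affine function, and confirms that each result keeps nonlinearity 6. The helper names and the indexing convention (bit i-1 of the table index holds $x_i$) are assumptions made for this illustration.

```python
def parity(x):
    """1 if x has an odd number of one bits, else 0."""
    return bin(x).count("1") & 1


def affine_tables(n):
    """Truth tables of all 2^(n+1) affine functions c XOR (w . x)."""
    return [[c ^ parity(w & x) for x in range(1 << n)]
            for w in range(1 << n) for c in (0, 1)]


def nonlinearity(tt, n):
    """Minimum Hamming distance from tt to every affine function."""
    return min(sum(a ^ b for a, b in zip(tt, aff)) for aff in affine_tables(n))


n = 4
# Truth table of the bent function x1x2 XOR x3x4.
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
        for x in range(1 << n)]

# XOR the bent function with each affine function to obtain its A-class.
a_class = {tuple(b ^ a for b, a in zip(bent, aff)) for aff in affine_tables(n)}

print(len(a_class))                                      # 32
print({nonlinearity(list(h), n) for h in a_class})       # {6}
```

The 32 members include the starting function itself, which corresponds to the Exclusive-Or with the constant-0 affine function.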


C.  CIRCUIT  ANALYSIS 

A  simple  circuit  can  be  created  to  compute  the  nonlinearity  of  a  function.  The 
block form is shown in Figure 1.


Figure  1.  Process  for  Computing  Nonlinearity  of  a  Function  (From  [17]) 

This circuit is built using the Verilog programming language. The input is one $2^n$-bit
function and the output is an $(n+1)$-bit nonlinearity. Two sets of code were used in
computations. The original code proved correct but was inefficiently designed, and for large
n the design did not meet compiler specifications. This code is included in Appendix
A.2. The second version of this code was written by another student [16] and had a more
efficient design. This code was modified and is included in Appendix A.1.6. The
included  code  is  designed  for  6-variable  functions,  but  can  be  easily  adapted  to  test 
functions  of  other  values  of  n. 

With this circuit, the number of functions that can be tested per second is
determined by the 100 MHz speed of the FPGA. Table 3 shows the time required for small n.
Since an exhaustive search for $n \ge 6$ is not feasible, the next step is to test only some
subsets  of  all  functions.  If  certain  subsets  are  found  to  be  rich  in  bent  functions,  the  time 
required  to  search  these  smaller  sets  of  functions  becomes  more  practical. 


| n | # of Functions | Comp. Time, All Functions, 100 MHz |
|---|---|---|
| 2 | 16 | 0.16 μsec |
| 3 | 256 | 2.56 μsec |
| 4 | 65,536 | 655.4 μsec |
| 5 | $4.2950 \times 10^{9}$ | 42.9 sec |
| 6 | $1.8447 \times 10^{19}$ | 5,849 yrs |
| 7 | $3.4028 \times 10^{38}$ | $1.1 \times 10^{23}$ yrs |
| 8 | $1.1579 \times 10^{77}$ | $3.7 \times 10^{61}$ yrs |
| 9 | $1.3408 \times 10^{154}$ | $4.3 \times 10^{138}$ yrs |

Table 3. Time to Compute Nonlinearity of Listed Number of Functions (From [17])
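
The entries of Table 3 follow from dividing the $2^{2^n}$ functions by the rate of one function per clock cycle at 100 MHz. The Python sketch below reproduces that arithmetic; the constant and function names are chosen here for illustration, and the 365-day year is an assumption that appears to match the table's 5,849-year entry for n=6.

```python
CLOCK_HZ = 100e6                      # one function tested per cycle at 100 MHz
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumption: a 365-day year


def exhaustive_time_seconds(n):
    """Seconds needed to test all 2^(2^n) functions on n variables."""
    return 2 ** (2 ** n) / CLOCK_HZ


def exhaustive_time_years(n):
    """The same time expressed in years."""
    return exhaustive_time_seconds(n) / SECONDS_PER_YEAR


for n in range(2, 10):
    secs = exhaustive_time_seconds(n)
    if secs < SECONDS_PER_YEAR:
        print(f"n={n}: {secs:.3g} sec")
    else:
        print(f"n={n}: {exhaustive_time_years(n):.3g} yrs")
```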

Studying  several  properties  of  Boolean  functions  and  choosing  those  properties 
that  reduce  the  function  set  is  the  best  way  to  begin  the  search.  The  next  challenge  is 
learning  how  to  generate  a  specific  set  of  Boolean  functions.  Functions  with  properties 
like  rotation  symmetry  and  dihedral  symmetry  can  be  generated  by  creating  a  mapper  to 
specific  truth  table  values  and  then  inputting  a  counter  into  the  mapper.  This  will  ensure 
that  certain  sets  of  terms  always  have  the  same  value.  Other  properties,  like  degree  and 
homogeneity,  cannot  be  easily  determined  from  examining  a  truth  table.  The  function 
must be in Algebraic Normal Form. One way to transform a function from a truth table
to Algebraic Normal Form is to use the transeunt triangle. This conversion capability
allows  a  set  of  functions  to  be  created  with  any  property,  converted  easily  between 
function  forms  and  tested  for  nonlinearity  or  other  properties.  The  transeunt  triangle  is 
discussed  in  the  next  chapter. 
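
The mapper-and-counter idea for generating rotation symmetric functions can be sketched in software as follows. This Python version is only an illustration of the principle, not the SRC-6 implementation; the names `rotation_orbits` and `rotation_symmetric_functions` are hypothetical. Each bit of the counter is mapped onto every truth table position in one orbit, so all positions in an orbit always share a value.

```python
def rotate_left(x, n):
    """Cyclically rotate the n-bit input vector x by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def rotation_orbits(n):
    """Partition the 2^n input vectors into orbits under cyclic rotation."""
    seen, orbits = set(), []
    for x in range(1 << n):
        if x in seen:
            continue
        orbit, y = [], x
        while y not in seen:
            seen.add(y)
            orbit.append(y)
            y = rotate_left(y, n)
        orbits.append(orbit)
    return orbits


def rotation_symmetric_functions(n):
    """Yield every rotation symmetric truth table on n variables."""
    orbits = rotation_orbits(n)
    for counter in range(1 << len(orbits)):
        tt = [0] * (1 << n)
        for k, orbit in enumerate(orbits):
            bit = (counter >> k) & 1      # counter bit k drives orbit k
            for x in orbit:               # the "mapper": one bit, many positions
                tt[x] = bit
        yield tt


print(len(rotation_orbits(4)))                         # 6
print(sum(1 for _ in rotation_symmetric_functions(4))) # 64
```

For n=4 the counter needs only 6 bits, so 64 functions are generated instead of the 65,536 in the full set.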

III. THE TRANSEUNT TRIANGLE


The  transeunt  triangle  is  a  data  structure  proposed  by  V.  Suprun  [18]  to  derive  the 
minimum  mixed-polarity  Reed-Muller  canonical  expression  for  a  given  symmetric 
function.  About  10  years  earlier,  D.  Green  [11]  observed  that  the  transeunt  triangle  could 
be  used  to  derive  the  positive  polarity  Reed-Muller  canonical  expression  (ANF)  from  the 
truth  table  of  a  general  (not  necessarily  symmetric)  function.  Neither  Green  nor  Suprun 
proved  their  observations.  Recently,  Butler,  Dueck,  Yanushkevich,  and  Shmerko  [19] 
proved  Suprun's  observation.  Green's  observation  is  proven  in  this  thesis. 

Another  contribution  of  this  thesis  is  to  demonstrate  that  the  transeunt  triangle  has 
significant benefit in computational applications. That is, it is shown that one can quickly
compute the ANF of a function f from the truth table of f. Conversely, given the ANF of
a function f, the truth table of f can be quickly computed. This is important for two
reasons:

1. Properties, like homogeneity and A-class membership, are easily computed
from  the  ANF,  but  not  from  the  truth  table. 

2.  Properties,  like  nonlinearity  and  balancedness,  are  easily  computed  from  the 
truth  table,  but  not  from  the  ANF. 
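
The two kinds of tests listed above can be sketched in Python as follows. The representation is an assumption made for this illustration: the ANF is a list of coefficients in which the binary value of each index marks the variables present in that term, and the truth table is a list of function values. The names `degree`, `is_homogeneous` and `is_balanced` are hypothetical.

```python
def degree(anf):
    """Largest number of variables in any term whose coefficient is 1.

    Returns 0 for a constant function (an assumption for the all-zero ANF).
    """
    return max((bin(k).count("1") for k, c in enumerate(anf) if c), default=0)


def is_homogeneous(anf):
    """True if every term present in the ANF has the same number of variables."""
    degrees = {bin(k).count("1") for k, c in enumerate(anf) if c}
    return len(degrees) == 1


def is_balanced(tt):
    """True if the truth table has as many ones as zeros."""
    return 2 * sum(tt) == len(tt)


# ANF of x1x2 XOR x3x4: coefficients at indices 0011 and 1100 are 1.
anf = [1 if k in (0b0011, 0b1100) else 0 for k in range(16)]
print(degree(anf), is_homogeneous(anf))   # 2 True

# Truth table of x1 XOR x2 (n=2), which is balanced.
print(is_balanced([0, 1, 1, 0]))          # True
```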

The code instantiating the transeunt triangle was initially written in Verilog by Dr.
J. T. Butler utilizing n $2^n$-bit registers in a pipelined series of Exclusive-Or operations.
The output is calculated in one clock cycle. This code could only compile on the SRC-6
for $n \le 8$ due to memory constraints. The code was modified in this thesis to work for
$n = 9$ and then further modified to work for larger n. The code is listed in Appendix A.3
and A.4.

A. CONVERTING BETWEEN A TRUTH TABLE AND ALGEBRAIC NORMAL FORM

1.  Expanding  a  Boolean  Function  Given  as  a  Truth  Table 

A Boolean function is created using a series of n variables where a value, either 1
or 0, is assigned to each of the $2^n$ combinations of variables. The resulting truth table can
be written as a function by using the unique variable combination as an index to the
assigned Boolean value. The value becomes a coefficient for the term represented by the
logical AND of the variable combination.


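For n=3, the truth table is

| $x_3$ | $x_2$ | $x_1$ | $f$ |
|---|---|---|---|
| 0 | 0 | 0 | $d_0$ |
| 0 | 0 | 1 | $d_1$ |
| 0 | 1 | 0 | $d_2$ |
| 0 | 1 | 1 | $d_3$ |
| 1 | 0 | 0 | $d_4$ |
| 1 | 0 | 1 | $d_5$ |
| 1 | 1 | 0 | $d_6$ |
| 1 | 1 | 1 | $d_7$ |

The coefficients are $d_0$ through $d_7$. The expansion is:

$$f(x_3, x_2, x_1) = d_0\bar{x}_3\bar{x}_2\bar{x}_1 + d_1\bar{x}_3\bar{x}_2x_1 + d_2\bar{x}_3x_2\bar{x}_1 + d_3\bar{x}_3x_2x_1 + d_4x_3\bar{x}_2\bar{x}_1 + d_5x_3\bar{x}_2x_1 + d_6x_3x_2\bar{x}_1 + d_7x_3x_2x_1 \qquad (1)$$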


Any term can be formed by taking the n-bit binary value of the index: if a
bit is 1, the corresponding variable is included in the term; if the bit is 0, the
corresponding variable is complemented and included in the term.

Example 3.1. The term in 4 variables with coefficient $d_9$ corresponds to the
binary value $x_4x_3x_2x_1 = 1001$ (9), so the term formed is $d_9x_4\bar{x}_3\bar{x}_2x_1$.


2.  Expanding  a  Boolean  Function  Given  in  Algebraic  Normal  Form 

The Algebraic Normal Form (ANF) uses the Exclusive-Or operator to combine
terms in a function. The function string is not the same as the truth table string. The
formal expression is

$$f(x_n, x_{n-1}, \ldots, x_1) = c_0 \oplus \bigoplus_{1 \le i \le n} c_i x_i \oplus \bigoplus_{1 \le i < j \le n} c_{ij} x_i x_j \oplus \cdots \oplus c_{12 \ldots n} x_1 x_2 \cdots x_n.$$

For n=3 the function coefficients are $c_0$ through $c_7$. The expansion is:

$$f(x_3, x_2, x_1) = c_0 \oplus c_1 x_1 \oplus c_2 x_2 \oplus c_3 x_2 x_1 \oplus c_4 x_3 \oplus c_5 x_3 x_1 \oplus c_6 x_3 x_2 \oplus c_7 x_3 x_2 x_1 \qquad (2)$$

In this case, any term can be formed by examining the binary value of the index:
if a bit is 1, the corresponding variable is included in the term; if a bit is 0, the
corresponding variable is not included in the term. If $c_0 = 1$, the term is just 1.

Example 3.2. The term in 4 variables with coefficient $c_9$ corresponds to
$x_4x_3x_2x_1 = 1001$ (9), so the term is $c_9 x_4 x_1$.
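The two indexing rules can be compared side by side with a short sketch. This is an illustration only, not code from the thesis; the function names `minterm` and `anf_term` are hypothetical, and a trailing apostrophe is used here to mark a complemented variable.

```python
def minterm(index, n):
    """Truth-table term for d_index: every variable appears, complemented when its bit is 0."""
    parts = []
    for k in range(n, 0, -1):                 # x_n corresponds to the most significant bit
        bit = (index >> (k - 1)) & 1
        parts.append(f"x{k}" if bit else f"x{k}'")   # x' denotes the complement of x
    return " ".join(parts)


def anf_term(index, n):
    """ANF term for c_index: only variables whose bit is 1 appear; index 0 is the constant 1."""
    parts = [f"x{k}" for k in range(n, 0, -1) if (index >> (k - 1)) & 1]
    return " ".join(parts) if parts else "1"


print(minterm(9, 4))    # x4 x3' x2' x1
print(anf_term(9, 4))   # x4 x1
```

Index 9 reproduces Examples 3.1 and 3.2: the same bit pattern 1001 yields a four-variable product in the truth-table expansion but only a two-variable product in the ANF.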


B.  TRANSEUNT  TRIANGLE  STRUCTURE,  USE  AND  PROOF 
1.  Definition  and  Structure 


Definition 3.1. The transeunt triangle is a series of Exclusive-Or operations with
an input of $2^n$ coefficients along the base and an output of $2^n$ coefficients along the
left side. The triangle is formed by performing an Exclusive-Or operation with
every two consecutive values on one row and placing the result in the next higher
row between the two values with which the operation was performed.


Figure 2 shows the operations inside a transeunt triangle for n=3.


Figure  2.  Transeunt  Triangle  for  n= 3  (After  [11]) 


In Figure 2, the coefficients ($d_0$ to $d_7$) that form the bottom row of the triangle are
taken from the truth table of a 3-variable function. Each succeeding operation shows the
coefficients that are included in the Exclusive-Or operation; for example, 0 ⊕ 1 denotes
$d_0 \oplus d_1$. Any value that is included twice is cancelled out. This follows from
$x_i \oplus x_i = 0$ and $0 \oplus g = g$. The values on the left side of the triangle when all
operations are complete are then placed in the Algebraic Normal Form of the function as
coefficients $c_i$, as in (2).

The transeunt triangle creates a bijective relationship between the ANF and the
truth table of a Boolean function f; i.e., it is a one-to-one correspondence. If T(S) is the
transeunt triangle of the Boolean string S, t is the truth table string and a is the ANF
string, then T(t)=a and T(a)=t.
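Definition 3.1 can be sketched in software in a few lines. The following is a minimal Python illustration, not the Verilog implementation referred to in the appendices; the name `transeunt_triangle` and the majority-function example are assumptions made here for demonstration.

```python
def transeunt_triangle(bits):
    """Return the left side of the triangle, read bottom to top, for the base row `bits`."""
    row = list(bits)
    left = []
    while row:
        left.append(row[0])                            # leftmost node of the current row
        row = [a ^ b for a, b in zip(row, row[1:])]    # XOR of every two consecutive values
    return left


# Hypothetical example: the majority function of x3, x2, x1 (truth table d0..d7).
truth_table = [0, 0, 0, 1, 0, 1, 1, 1]
anf = transeunt_triangle(truth_table)
print(anf)                                      # [0, 0, 0, 1, 0, 1, 1, 0]
print(transeunt_triangle(anf) == truth_table)   # True
```

The output coefficients c3, c5 and c6 correspond to x2x1 ⊕ x3x1 ⊕ x3x2, and applying the same routine to the ANF string returns the truth table, which is the property T(t)=a and T(a)=t stated above.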

2.  Examples 

Example 3.3: The 3-variable truth table of f is

x3  x2  x1  f
0   0   0   1
0   0   1   1
0   1   0   1
0   1   1   1
1   0   0   1
1   0   1   1
1   1   0   1
1   1   1   0

When f is placed along the bottom row of a transeunt triangle, and the operations
are computed, the triangle becomes the one shown in Figure 3. The result can be
read on the left side of the triangle from bottom to top, where coefficients $c_0$ and $c_7$
are 1s and all other coefficients are 0s. The ANF expansion is $1 \oplus x_1x_2x_3$.

       1
      0 1
     0 0 1
    0 0 0 1
   0 0 0 0 1
  0 0 0 0 0 1
 0 0 0 0 0 0 1
1 1 1 1 1 1 1 0

Figure 3. Transeunt Triangle for $f(x_3, x_2, x_1) = 1 \oplus x_1x_2x_3$

(End  of  Example) 

Example 3.4: If the ANF coefficients are placed along the bottom row, the
truth table appears along the left side. Suppose the ANF is $x_1 \oplus x_2 \oplus x_3$. The
triangle is shown in Figure 4.

       1
      0 1
     0 0 1
    1 1 1 0
   0 1 0 1 1
  1 1 0 0 1 0
 1 0 1 1 1 0 0
0 1 1 0 1 0 0 0

Figure 4. Transeunt Triangle for $f(x_3, x_2, x_1) = x_1 \oplus x_2 \oplus x_3$

The truth table is the left side of the triangle from bottom to top, shown here:

x3  x2  x1  f
0   0   0   0
0   0   1   1
0   1   0   1
0   1   1   0
1   0   0   1
1   0   1   0
1   1   0   0
1   1   1   1

(End  of  Example) 
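The truth table of Example 3.4 can be cross-checked without the triangle by evaluating the ANF directly at every input. The sketch below is an independent check written for illustration; the name `truth_table_from_anf` is hypothetical, and the method is plain evaluation of (2), not the transeunt triangle.

```python
def truth_table_from_anf(anf, n):
    """Evaluate the ANF coefficient list c_0..c_(2^n - 1) at every input combination."""
    table = []
    for x in range(2 ** n):                  # x encodes x_n ... x_1 as a binary number
        value = 0
        for j, c in enumerate(anf):
            if c and (j & x) == j:           # term j equals 1 only if all its variables are 1
                value ^= 1
        table.append(value)
    return table


anf = [0, 1, 1, 0, 1, 0, 0, 0]               # x1 XOR x2 XOR x3, the base row of Figure 4
print(truth_table_from_anf(anf, 3))          # [0, 1, 1, 0, 1, 0, 0, 1]
```

The printed list matches the left side of Figure 4 read from bottom to top.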

3. Proof that the Transeunt Triangle Converts between the ANF and the
Truth Table of a Boolean Function f

Theorem  3.1:  The  transeunt  triangle  converts  the  truth  table  of  an  n-variable 
function  f  into  the  Algebraic  Normal  Form  of  f. 

Proof:  (by  induction) 

First, we show the hypothesis is true for n=1.

Using the form $f(x_1) = d_0\bar{x}_1 + d_1 x_1$, the four possible functions on one variable are:

d0  d1  f
0   0   0
0   1   $x_1$
1   0   $\bar{x}_1$
1   1   1


There are four possible transeunt triangles that can be made for n=1. In each, the
base row holds $d_0$ and $d_1$, the lower left node is $c_0$, and the apex is $c_1$. They are

   0          1          1          0        (apex: c1)
  0 0        0 1        1 0        1 1       (base: d0 d1, with c0 = d0)

The new ANF expression for each triangle is $f(x_1) = c_0 \oplus c_1 x_1$, where $c_0 = d_0$ and
$c_1 = d_0 \oplus d_1$.

To show that the left nodes of each triangle are indeed the coefficients of the ANF
of each function, observe the following:

The equation of f using its truth table coefficients (Shannon decomposition):

$$f(x_1) = d_0\bar{x}_1 + d_1 x_1 \qquad (3)$$

Replacing + by ⊕ preserves the equality, because the two terms are never 1 at the same time:

$$f(x_1) = d_0\bar{x}_1 \oplus d_1 x_1$$

Identity used to replace $\bar{x}_1$:

$$\bar{a} = a \oplus 1$$

Using distributive and associative laws, f becomes

$$f(x_1) = d_0(x_1 \oplus 1) \oplus d_1 x_1 = d_0 x_1 \oplus d_0 \oplus d_1 x_1 = d_0 \oplus (d_0 \oplus d_1)x_1$$

f is now in Algebraic Normal Form:

$$f(x_1) = c_0 \oplus c_1 x_1 \qquad (4)$$

The ANF coefficients are:

$$c_0 = d_0, \qquad c_1 = d_0 \oplus d_1$$

Next, we assume the hypothesis is true for n=k. We prove that this implies it is
true for n=k+1. The triangle for k+1 can be broken down into three triangles of size k as
shown in Figure 5. Along the bottom of the lower triangles are the coefficients of f,
where $x_k=0$ on the left and $x_k=1$ on the right, using the notation $f(0 \to x_k)\bar{x}_k$ and
$f(1 \to x_k)x_k$, respectively, following the form of (3) where k=1. We have assumed the
triangle is correct for k, so the nodes on the left side of each lower triangle are the
coefficients for the ANF represented by the functions $f^L(0 \to x_k)$ and $f^L(1 \to x_k)$ for the
left and right triangles, respectively. This follows the form of (4) where $c_0 = f^L(0 \to x_k)$.

Figure 5. The composition of k-sized triangles to form a k+1 sized triangle

We show that the bottom line of the upper triangle can be represented by the
function $f(0 \to x_k) \oplus f(1 \to x_k)$, as follows. The number of paths in a tree-like
triangle with $2^a+1$ rows from a point on the bottom row to the root is determined by
$C(2^a, i)$, where C(m, p) is the number of ways to choose p objects from m objects without
repetition. $2^a$ is the number of hops required to get to the top of the triangle (a=k in the
proof for k+1) and i is the index for the base of this new triangle from 0 to $2^a$. For i=0 or
$i=2^a$, the number of paths is 1, since $C(2^a, 0) = C(2^a, 2^a) = 1$.

Theorem 3.2. $C(2^a, i) \bmod 2 = 0$ for $0 < i < 2^a$. (Special use of Lucas' Theorem,
$i \cdot C(p^a, i) = p^a \cdot C(p^a - 1, i - 1)$, where p is prime.) [20]

From Theorem 3.2, for all other $0 < i < 2^a$, the number of possible paths from the
base to the root is even. Because this number is even, the inner coefficients traveling
through the triangle will ultimately be cancelled out.
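The parity claim of Theorem 3.2 is easy to spot-check numerically for small a. The sketch below is only a check of small cases, not a proof; the function name `odd_path_positions` is hypothetical.

```python
from math import comb


def odd_path_positions(a):
    """Base positions i in 0..2^a whose path count C(2^a, i) to the root is odd."""
    m = 2 ** a
    return [i for i in range(m + 1) if comb(m, i) % 2 == 1]


for a in range(1, 6):
    # Only the two end positions, i = 0 and i = 2^a, should appear.
    print(a, odd_path_positions(a))
```

For each a tested, only the two outer base nodes reach the root an odd number of times, so only those two coefficients survive the Exclusive-Or.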

The result along the base of the upper triangle is, therefore,
$S' = f(0 \to x_k) \oplus f(1 \to x_k)$, as shown in Figure 5. It follows that the upper triangle (of
size k) produces a left side (or ANF) of $(f(0 \to x_k) \oplus f(1 \to x_k))^L x_k$, where $c_1$ in (4) is
$(f(0 \to x_k) \oplus f(1 \to x_k))^L$. The ANF string for the k+1 triangle is then the Exclusive-Or
of the left sides of the lower left and upper triangles:

$$f(0 \to x_k)^L \oplus (f(0 \to x_k) \oplus f(1 \to x_k))^L x_k.$$

Q.E.D. 
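The inductive step suggests a recursive way to compute the same coefficients: split the truth table into its two halves, take the ANF of the lower half, and append the ANF of the Exclusive-Or of the halves. The sketch below compares that recursion with the triangle itself; both function names are hypothetical and the code is an illustration of the argument, not part of the proof.

```python
def anf_by_triangle(tt):
    """Left side of the transeunt triangle, read bottom to top."""
    row, left = list(tt), []
    while row:
        left.append(row[0])
        row = [a ^ b for a, b in zip(row, row[1:])]
    return left


def anf_by_halves(tt):
    """ANF via the split used in the inductive step (length must be a power of two)."""
    if len(tt) == 1:
        return list(tt)
    half = len(tt) // 2
    f0, f1 = tt[:half], tt[half:]                     # top variable fixed to 0 and to 1
    upper_base = [a ^ b for a, b in zip(f0, f1)]      # base row of the upper triangle
    return anf_by_halves(f0) + anf_by_halves(upper_base)


tt = [1, 1, 1, 1, 1, 1, 1, 0]                         # truth table from Example 3.3
print(anf_by_halves(tt))                              # [1, 0, 0, 0, 0, 0, 0, 1]
print(anf_by_halves(tt) == anf_by_triangle(tt))       # True
```

The first half of the result is the left side of the lower left triangle and the second half is the left side of the upper triangle, mirroring Figure 5.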

Figure 6 shows a triangle for n=3, where 4 triangles can be formed so that the root
of each triangle makes up row 5. The colored paths shown are the most direct paths from
each base node to the corresponding root. In the figure, each coefficient along the base of
the outer triangle is included exactly once at the root with a direct path to it and is not
included at any other roots. The labels a, b, c, and d in the figure correspond to the
Exclusive-Or of two coefficients: $a = d_0 \oplus d_4$, $b = d_1 \oplus d_5$, $c = d_2 \oplus d_6$, $d = d_3 \oplus d_7$.
This is the same as $f(0 \to x_k) \oplus f(1 \to x_k)$.

Figure 6. Four triangles are formed showing only one path from the corner of each
triangle to the top.

To further show that each coefficient appears in $f(0 \to x_k) \oplus f(1 \to x_k)$ only
once, consider the following. The inner indices of each triangle in Figure 6 will travel
through Exclusive-Or operators to the root of that triangle following an even number of
possible paths. This means that each coefficient value will be represented at the root an
even number of times and will therefore cancel out, leaving only the coefficients on the
outside edge of each triangle.

For coefficient $d_1$ in Figure 6, there are $C(2^2, 1) = 4$ paths by which $d_1$ reaches
node a. Node a therefore includes 4 terms of $d_1$, but since $d_1 \oplus d_1 = 0$, all $d_1$ terms cancel.
Only the terms with an odd number of paths, $d_0$ and $d_4$, remain at node a.

C.  PROPERTIES  OF  BOOLEAN  FUNCTIONS 

1.  Degree 

The degree of a function refers to the number of variables in the term with the
most variables in Algebraic Normal Form. For example, the function
$f = x_1x_2x_3 \oplus x_1x_3 \oplus x_2$ has degree 3, since the term with the most variables has three
variables. Lemma 2.2 states that bent functions of degree n/2 for even n exist. For
example, the group of all 6-variable functions contains bent functions of degree 2 and 3
only. Affine functions are functions of degree 1 and 0, and so these functions never need
to be tested for bentness. Lemma 2.2 also states that there are no bent functions with
degree greater than n/2 for n > 4; therefore, functions with degree greater than n/2 do not
need to be tested. This reduces the set of testable functions considerably; however, for
larger n, the test set is still overwhelming.

Figure 7 shows the number of functions for n=8 for each degree. There are
$1.16920130986472 \times 10^{49}$ functions of degree 4 for n=8. This is only a fraction of about
$1 \times 10^{-28}$ of all $2^{256}$ functions on n=8. Despite the massive reduction in the test set, the
code to determine the nonlinearity of each function would take $3.7 \times 10^{33}$ years to test on the
SRC-6 at one function per clock cycle.
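The count of functions of a given degree follows from counting ANF terms: there are C(n, i) terms with i variables, so the functions of degree at most k number 2 raised to the sum of C(n, i) for i up to k. The sketch below applies this to n=8. The function names are hypothetical, the constants 0 and 1 are both counted as degree 0 (as the thesis does for n=4), and a year is assumed to be 365.25 days.

```python
from math import comb

SECONDS_PER_YEAR = 365.25 * 24 * 3600      # assumed length of a year
RATE = 100e6                               # one function per clock cycle at 100 MHz


def count_degree_at_most(n, k):
    """Number of n-variable functions whose ANF uses only terms of k or fewer variables."""
    return 2 ** sum(comb(n, i) for i in range(k + 1))


def count_degree_exactly(n, d):
    """Number of n-variable functions of degree exactly d (0 and 1 both count as degree 0)."""
    if d == 0:
        return 2
    return count_degree_at_most(n, d) - count_degree_at_most(n, d - 1)


count = count_degree_exactly(8, 4)
years = count / RATE / SECONDS_PER_YEAR
print(f"{count:.4e} functions, about {years:.1e} years")
```

Summing the counts over all degrees from 0 to 8 gives 2^256, the total number of 8-variable functions, which is a quick consistency check on the formula.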
[Figure: bar chart of the number of 8-variable functions (logarithmic scale) for each degree from 0 to 8.]

Figure 7. Number of Functions on 8 Variables by Degree


Code was written to both test a function for its degree and to generate functions
with specific degree. Both sets of code were useful. When generating functions in truth
table form, the degree was found by converting the function to ANF using the transeunt
triangle and then testing it to determine the degree. Generating functions with a specific
degree in ANF created a new test set. The transeunt triangle was then used to convert the
function to a truth table and, in this form, find its nonlinearity. The code used to
determine the degree of a function within a subroutine is shown in Appendix A.7.2. The
code used to generate a function with degree d is shown in Appendix A.6.
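Once a function is in ANF, its degree is the largest number of 1 bits in the index of any nonzero coefficient. A minimal sketch of that test follows; it is not the subroutine of Appendix A.7.2, and the name `degree` and the list representation are assumptions made for illustration.

```python
def degree(anf):
    """Largest number of variables in any term with a nonzero coefficient (0 if there is none)."""
    return max((bin(j).count("1") for j, c in enumerate(anf) if c), default=0)


# f = x1x2x3 XOR x1x3 XOR x2: coefficients c7, c5 and c2 are 1.
anf = [0, 0, 1, 0, 0, 1, 0, 1]
print(degree(anf))   # 3
```

Because the index of a coefficient already encodes which variables appear in its term, no parsing of the expression is needed; a population count of each index is enough.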

2.  Homogeneity 

A  function  in  which  all  terms  have  the  same  degree  is  called  homogeneous. 
Homogeneous  functions  of  order  1  or  0  and  their  complements  are  the  affine  functions. 
Homogeneous  functions  represent  an  even  smaller  test  group  than  functions  of  specific 
degree,  because  they  are  a  subset  of  these  functions.  Figure  8  shows  the  distribution  for 
homogeneous  functions  on  n=8.  From  Theorem  2.1,  there  are  no  homogeneous  bent 
functions  of  degree  m  on  2m  variables  where  m>3. 


[Figure: bar chart of the number of homogeneous functions on 8 variables for each degree; the bar for degree 4 is labeled 1.18E+21.]

Figure 8. Number of Homogeneous Functions on 8 Variables by Degree

Continuing with the example for n=8, the number of homogeneous functions of
degree 4, meaning all terms have 4 variables, is $1.18059162071741 \times 10^{21}$. This set of
functions can be excluded from testing since Theorem 2.1 states that no bent functions
exist in this group. For comparison, it would take $3.7 \times 10^{5}$ years to compute the
nonlinearity of all functions in this set. This is $10^{28}$ times faster than the computation for
all functions of degree 4. The homogeneous functions of degree 3 would take 22 years to
compute; however, it is possible to test the homogeneous functions of degree 2, since this
would take less than 3 seconds. The time required to test a function grows exponentially
with n.
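These estimates can be reproduced from the number of terms of each degree. As a worked sketch, assuming the all-zero function is counted in each set, a rate of $10^8$ functions per second, and about $3.16 \times 10^7$ seconds per year:

```latex
\begin{align*}
N_d &= 2^{\binom{8}{d}}, \qquad
  N_4 = 2^{70} \approx 1.18 \times 10^{21}, \quad
  N_3 = 2^{56} \approx 7.2 \times 10^{16}, \quad
  N_2 = 2^{28} \approx 2.7 \times 10^{8} \\
t_4 &= \frac{N_4}{10^{8}\ \mathrm{s^{-1}}} \approx 1.18 \times 10^{13}\ \mathrm{s}
       \approx 3.7 \times 10^{5}\ \text{years} \\
t_3 &= \frac{N_3}{10^{8}\ \mathrm{s^{-1}}} \approx 7.2 \times 10^{8}\ \mathrm{s}
       \approx 22.8\ \text{years} \\
t_2 &= \frac{N_2}{10^{8}\ \mathrm{s^{-1}}} \approx 2.7\ \mathrm{s}
\end{align*}
```

The ratio between the time for all functions of degree 4 (about $3.7 \times 10^{33}$ years) and $t_4$ is about $10^{28}$.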

Along with code written for determining the degree of a function, code to
determine if a function was homogeneous was also written. This was done in both C and
Verilog. It was determined that running the code in the subroutine as C code was much
faster than calling a Verilog module as a macro. It is simple to generate a mapper to
create homogeneous functions of a specific degree. A separate mapper must be generated
for each n and each degree. Another way to generate homogeneous functions using n and
the degree as an input was used, but proved slower when computing. Code used to test
for homogeneity and to generate homogeneous functions is included in Appendix A.6.
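Both tasks can be sketched briefly in software. The following is not the code of Appendix A.6; the function names and the mapping from counter bits to terms are assumptions chosen for illustration, and the all-zero function is treated as homogeneous.

```python
from itertools import combinations


def is_homogeneous(anf):
    """True when every nonzero ANF term has the same number of variables."""
    weights = {bin(j).count("1") for j, c in enumerate(anf) if c}
    return len(weights) <= 1          # the all-zero function is accepted here by convention


def homogeneous_from_counter(counter, n, d):
    """Map the bits of `counter` onto the C(n, d) terms of degree d and return the ANF list."""
    anf = [0] * (2 ** n)
    terms = [sum(1 << v for v in combo) for combo in combinations(range(n), d)]
    for k, index in enumerate(terms):
        anf[index] = (counter >> k) & 1
    return anf


anf = homogeneous_from_counter(0b101, 3, 2)
print(anf, is_homogeneous(anf))       # [0, 0, 0, 1, 0, 0, 1, 0] True
```

Stepping `counter` from 0 to 2^C(n,d) - 1 enumerates every homogeneous function of degree d exactly once, which is the role a mapper plays when driven by a hardware counter.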
3. Rotation Symmetric

Functions whose value is unchanged when the variables in the function are rotated
circularly to each position are called rotation symmetric (see Definition 2.12). These
functions have been tested for bentness, as they form another small group. Stanica and
Maitra [21] conjectured that there are no homogeneous rotation symmetric bent functions
of degree greater than two. This is not proven but can be tested exhaustively for n ≤ 8.
It cannot be tested exhaustively for n=10, since there are $3.24 \times 10^{32}$ rotation symmetric
functions, requiring $10^{17}$ years to test. For testable group sizes, rotation symmetric bent
functions are found. Table 4 shows the number of rotation symmetric bent functions for
n=4, 5, and 6.


n   Number of Rotation Symmetric Bent Functions   Total Bent Functions
4   8                                             896
5   36                                            27,387,136
6   48                                            5,425,430,528

Table 4. Number of Rotation Symmetric Bent Functions


To generate rotation symmetric functions, a mapper must be created for each n. A
bit is assigned to each term of the truth table of the function so that rotation symmetric
terms have the same value. These assign statements were generated in a C program used
to determine which sets of terms were rotation symmetric. This code is included in
Appendix B.1. Once the function was formed, the nonlinearity could be determined. For
n=6, the distribution of nonlinearities of rotation symmetric functions is shown in Figure
9.
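The grouping of truth table entries into rotation orbits can be sketched as follows. This is a Python illustration of the idea, not the C program of Appendix B.1; the function names and the assignment of counter bits to orbits are hypothetical.

```python
def rotation_orbits(n):
    """Partition the 2^n input combinations into orbits under cyclic rotation of the variables."""
    mask = (1 << n) - 1
    seen, orbits = set(), []
    for x in range(1 << n):
        if x in seen:
            continue
        orbit, y = [], x
        while y not in seen:
            seen.add(y)
            orbit.append(y)
            y = ((y << 1) | (y >> (n - 1))) & mask    # rotate left by one position
        orbits.append(orbit)
    return orbits


def rotation_symmetric_function(counter, orbits, n):
    """Truth table in which bit k of `counter` is the value shared by every entry of orbit k."""
    table = [0] * (1 << n)
    for k, orbit in enumerate(orbits):
        for x in orbit:
            table[x] = (counter >> k) & 1
    return table


orbits = rotation_orbits(6)
print(len(orbits))   # 14
```

With 14 orbits for n=6 there are 2^14 rotation symmetric functions, so a 14-bit counter is enough to enumerate them all.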
Figure 9. Nonlinearity Distribution for Rotation Symmetric Functions on 6 Variables


To  determine  the  degree  or  homogeneity  of  the  function,  the  function  was 
converted  to  ANF  using  the  transeunt  triangle  and  the  result  was  tested  for  these 
properties.  Figure  10  shows  the  rotation  symmetric  functions  on  6  variables  distributed 
by  degree  and  homogeneity.  The  number  above  the  bar  is  the  number  of  functions  with 
the  corresponding  degree  and  the  number  below  is  the  number  of  homogeneous  functions 
for  that  degree. 
Figure 10. The Distribution by Degree and Homogeneity of Rotation Symmetric
Functions on 6 Variables (Degree: Upper Number and Blue Bar. Homogeneity:
Lower Number and Red Bar.)


4.  Dihedral  Symmetric 

Rotation  symmetric  functions  that  contain  dihedral  orbits  with  the  same  function 
values  are  called  dihedral  symmetric  (See  Definition  2.13). 


Example 3.6: For n=6, there are 14 orbits where rotating a combination of the 6
variables will produce another combination of variables in the same orbit. Below
are two orbits.

Orbit 1: {x1x2x4, x2x3x5, x3x4x6, x1x4x5, x2x5x6, x1x3x6}.

Orbit 2: {x3x5x6, x2x4x5, x1x3x4, x2x3x6, x1x2x5, x1x4x6}.

For rotation symmetric functions, each combination of variables in the same orbit
must be assigned the same function value. A function is dihedral symmetric if, in
addition to rotation symmetry, it contains one or more orbits such that, when flipped,
another orbit is produced and all terms in both orbits have the same function values.
Orbits 1 and 2 in Example 3.6 above are dihedral symmetric on 6 variables. These
functions are a subset of rotation symmetric functions. In the case of n=4, all rotation
symmetric functions are also dihedral symmetric. This property is a good way to break
down a large set of Boolean functions into a much smaller set. For functions on 8
variables, there are $2^{36}$ rotationally symmetric functions, but only $2^{30}$ dihedral symmetric
functions. This group is just 1.5% of the rotation symmetric functions. Some bent
functions are dihedral symmetric.
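The orbit counts behind these figures can be recomputed by brute force. The sketch below counts orbits under rotation alone and under rotation plus a flip; it assumes the flip is a reversal of the variable order, and all names are hypothetical.

```python
def rotate(x, n):
    """Rotate the n-bit value x left by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def reflect(x, n):
    """Reverse the order of the n bits of x (assumed here to model the flip)."""
    return int(format(x, f"0{n}b")[::-1], 2)


def orbit_count(n, dihedral):
    """Number of orbits of the 2^n combinations under rotation, optionally with reflection."""
    representatives = set()
    for x in range(1 << n):
        images, y = [], x
        for _ in range(n):
            images.append(y)
            if dihedral:
                images.append(reflect(y, n))
            y = rotate(y, n)
        representatives.add(min(images))      # smallest image names the orbit
    return len(representatives)


for n in (4, 6, 8):
    r, d = orbit_count(n, False), orbit_count(n, True)
    print(n, r, d, f"{2 ** (d - r):.4f}")     # last column: dihedral share of the rotation symmetric set
```

A function is fixed by its value on each orbit, so r orbits give 2^r rotation symmetric functions and d orbits give 2^d dihedral symmetric functions; for n=8 the ratio 2^(d-r) is 1/64, which is the roughly 1.5% quoted above.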

5.  Balance 

The best functions for cryptography are balanced [10]; they have the same
number of 1s as 0s. The problem is that there are no bent functions that are balanced.
Looking at balanced functions that are nearly bent is an interesting topic. For example,
the $2^{13}$ dihedral symmetric functions of 6 variables were tested for nonlinearity and
balance. The highest possible nonlinearity for a bent function with 6 variables is 28. The
highest nonlinearity of a balanced function in the tested group was 24. These functions
could be altered in certain ways to make them more useful for cryptography. They are
listed in Appendix C.2.
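Nonlinearity and balance can both be read from the truth table. The sketch below uses a fast Walsh-Hadamard transform, which is a standard software technique and is not claimed to be the method of the thesis's hardware circuit; the function names and the example function x1x2 ⊕ x3x4 ⊕ x5x6 are assumptions made for illustration.

```python
def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of (-1)^f for a truth table given as a list of 0s and 1s."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b
        h *= 2
    return w


def nonlinearity(tt):
    """Minimum Hamming distance from the function to any affine function."""
    return (len(tt) - max(abs(v) for v in walsh_spectrum(tt))) // 2


def is_balanced(tt):
    """True when the truth table has as many 1s as 0s."""
    return 2 * sum(tt) == len(tt)


# Illustrative 6-variable function x1x2 XOR x3x4 XOR x5x6.
f = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
     ^ (((x >> 4) & 1) & ((x >> 5) & 1)) for x in range(64)]
print(nonlinearity(f), is_balanced(f))   # 28 False
```

The example reaches the bent value of 28 for 6 variables and, as expected for a bent function, is not balanced: it has 28 ones and 36 zeros.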
IV. COMPUTATION AND ANALYSIS


A.  THE  SRC-6  RECONFIGURABLE  COMPUTER 


The  SRC-6  reconfigurable  computer  in  Spanagel  Hall  at  the  Naval  Postgraduate 
School  is  the  computation  tool  used  for  this  thesis.  It  provides  greater  flexibility  to 
control  compilation  than  a  traditional  PC.  It  is  composed  of  two  PCs,  each  with  a 
Pentium  IV  microprocessor,  five  Multi-Adaptive  Processing  (MAP)  boards  each 
containing  three  Xilinx  Virtex-2  XC2V6000  FPGAs,  two  for  computing  and  one  for 
control, as well as 24 MB of On Board Memory (OBM). These boards are connected by
a high-bar switch. There are four 8 GB banks of common memory. The SNAP port can
send data from the microprocessor to the MAP at a maximum speed of 1400 MB/s.
Figure 11 shows the setup.

Figure 11. Layout of the SRC-6 (From [17])


There are several files required to run a program on the SRC-6. It can compile
code to execute either on the Intel processor or on the MAP. The files created are linked
to create a single executable. Intel-targeted files compile to a .o file and MAP-targeted
files compile using the Map C Compiler (MCC). main.c is written in C and calls a
subroutine, where the bulk of the computation is done. The main.c file usually formats
and displays the output and can send inputs to the subroutine. subr.mc is also written in
C, and runs on the MAP. It can call macros, either user-created or built-in. Local
memory and On Board Memory can be used for data storage. There are 6 banks of
usable OBM, each capable of storing 523,776 64-bit words. The FPGA has 144 Block
RAM units; each unit can store 2048 bytes. These units are dual ported so that a read and
a write can occur simultaneously.

A  user  macro  is  written  in  Verilog  or  VHDL  and  specifies  circuits  in  the  FPGA. 
This  is  usually  where  the  major  computations  occur.  The  macro  can  be  called  millions  of 
times  in  the  subroutine.  It  can  be  pipelined  to  increase  throughput,  a  major  advantage 
over  a  PC.  The  pipelined  characteristic  allows  one  computation  per  clock  cycle,  after  the 
first  computation  is  complete.  Any  user-defined  macro  needs  a  blk.v  file  and  an  info  file 
to  describe  input  and  output  names  as  well  as  list  the  macro  characteristics. 

B.  USING  THE  SRC-6 

The  SRC-6  was  extremely  useful  for  computing  the  nonlinearity  of  millions  of 
functions.  The  subroutine  generally  used  a  counter  where  each  number  in  the  counter 
was  sent  into  a  macro  that  created  a  function  to  be  tested.  The  function  was  then  tested 
for  its  nonlinearity.  The  nonlinearity  value  was  sent  back  to  the  subroutine  and  stored  in 
a  histogram  that  counted  the  number  of  functions  with  each  nonlinearity.  The  functions 
could  also  be  sent  into  the  subroutine  to  be  tested  for  degree,  homogeneity,  and  balance. 

1. Limitations

The main limitation of the SRC-6 was the speed of the FPGA, 100 MHz. A
maximum of 100,000,000 functions can be tested per second. It takes too long to test all
functions on more than 5 variables. Because of this, functions had to be divided into
smaller groups that might produce interesting results in the search for bent functions.
The largest group that can be analyzed with the current speed of 100 MHz is about $2^{40}$
functions. This would take around 3 hours to compute. A faster FPGA would improve
this number drastically. For example, a 500 MHz FPGA would be five times faster,
allowing more than $2^{42}$ functions to be tested in the same amount of time.
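The figures in this paragraph can be checked with a short calculation, assuming one function is tested per clock cycle:

```latex
\begin{align*}
\frac{2^{40}}{10^{8}\ \mathrm{s^{-1}}} &\approx \frac{1.10 \times 10^{12}}{10^{8}}\ \mathrm{s}
  \approx 1.10 \times 10^{4}\ \mathrm{s} \approx 3.05\ \text{hours} \\
5 \times 2^{40} &= 2^{40 + \log_2 5} \approx 2^{42.3} > 2^{42}
\end{align*}
```

The first line gives the time for $2^{40}$ functions at 100 MHz, and the second gives the number of functions a clock five times faster could test in that same time.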

There  are  certain  designs  that  require  extensive  calculations  to  occur  within  one 
clock  cycle.  A  design  like  this  cannot  be  compiled  if  the  time  required  between  clock 
cycles  is  more  than  10  ns.  This  is  a  limitation  that  can  sometimes  be  worked  around  with 
smart  programming  techniques.  Verilog  code  can  be  written  behaviorally  or  structurally. 
Behavioral  code  uses  for  loops,  conditional  statements  and  function  calls.  Structural  code 
uses  simple  functional  blocks  connected  by  wires  to  other  blocks  that  perform  simple 
operations  on  each  clock  pulse  or  new  input.  Registers  store  values  that  can  be  recalled  or 
changed  on  a  clock  pulse.  Structural  code  can  be  more  efficient  for  an  FPGA.  The 
clocked  pipeline  splits  the  code  into  sections  and  allows  the  user  to  examine  the  longest 
path  and  to  make  improvements.  In  behavioral  code,  like  a  for  loop,  it  is  more  difficult  to 
determine  where  the  slowdown  might  occur  and  just  as  difficult  to  split  up  complicated 
actions into pipelined steps. An extremely well written structural code used to compute
the nonlinearity of a function was created by another graduate student and a modified
version appears in Appendix A.1.6. This code reduced the latency so that simple actions
could be completed on each clock cycle. The number of variables could be increased
because even though the functions were longer and required more resources to test, the
steps were clocked in a way that allowed each step to be completed within 10 ns. As n
grows, however, adjustments will need to be made and there will still be a limit on n due
to the limitation of the FPGA speed.

Another  limitation  of  the  SRC-6  is  the  amount  of  space  available  for  hardware 
design  on  the  FPGA.  As  the  number  of  variables  in  a  function  grows,  the  space  required 
for  the  nonlinearity  circuit  grows.  If  there  are  no  ways  to  reduce  the  circuit  and  get  the 
same  results,  then  there  will  be  a  limit  on  n  for  the  nonlinearity  circuit.  More  than  one 
FPGA  can  be  used  for  the  same  circuit;  however,  this  has  not  been  attempted  for  the 
nonlinearity  circuit. 
2. Advantages

One advantage of the SRC-6 is its built-in, or callable, macros [22]. A test was
performed to show the difference that a well-coded callable macro can make over a
user-defined macro. The macro pop_count64 is designed to receive a string of 64 zeros and
ones and output the number of ones in that string. The user-defined macro ones_count
performs the same operation, but requires the extra files needed in a user-defined macro.
The space used on the FPGA is about the same, but the frequency at which ones_count
could deliver its result was below 100 MHz (an actual period of 11.704 ns against the
10 ns requested). This can cause incorrect output since the FPGA always runs on a
100 MHz clock. Ones_count is also more complex, since it required two extra logic levels.
The timing constraints are listed in Table 5. Both tests were run on 2 64-bit Boolean
functions. For 64-bit functions, the pop_count64 macro is more efficient than ones_count.


Constraint (pop_count64)                             | Requested | Actual   | Logic Levels
TS_CLOCK = PERIOD TIMEGRP "CLOCK" 10 ns HIGH 50%     | 10.000ns  | 9.598ns  | 8

Constraint (ones_count)                              | Requested | Actual   | Logic Levels
* TS_CLOCK = PERIOD TIMEGRP "CLOCK" 10 ns HIGH 50%   | 10.000ns  | 11.704ns | 10

Table 5. Comparison of Timing Specifications Between Macros with the Same
Functionality

Ones_count has one advantage in that it can be parameterized to work for any n,
whereas pop_count64 only works for n ≤ 6. In particular, the input to ones_count can be
more than 64 bits (n > 6). Another difference is that the pop_count64 macro can only be
called in the subroutine, but ones_count can be called from within another macro. In the
module ones_count, there is a case statement that chooses which operation to perform
based on n.
module Ones_Count (TT, Count);

// Ones_Count.v - A program to count the 1s in an input      //
//                                                           //
// Created: August 18, 2007                                  //
// Last Modified: October 27, 2008                           //
// Author: Jon T. Butler                                     //
// Modified by: Jennifer Shafer                              //
// Inputs:  TT     n-variable Truth Table   2^n bits         //
// Outputs: Count  Number of 1s             n+1 bits         //
//                                                           //

parameter n=6;
parameter B=2**n;
input [B-1:0] TT;
output [n:0] Count;
reg [n:0] Count;

always @(TT)
begin: CHECK_n
  case (n) // case statement for n=2 through n=6
    2: Count = Count2(TT);
    3: Count = Count2(TT[7:4]) + Count2(TT[3:0]);
    4: Count = Count2(TT[15:12]) + Count2(TT[11:8]) +
               Count2(TT[7:4]) + Count2(TT[3:0]);
    5: Count = Count2(TT[31:28]) + Count2(TT[27:24]) +
               Count2(TT[23:20]) + Count2(TT[19:16]) + Count2(TT[15:12])
               + Count2(TT[11:8]) + Count2(TT[7:4]) + Count2(TT[3:0]);
    6: Count = Count2(TT[63:60]) + Count2(TT[59:56]) +
               Count2(TT[55:52]) + Count2(TT[51:48]) + Count2(TT[47:44])
               + Count2(TT[43:40]) + Count2(TT[39:36]) + Count2(TT[35:32])
               + Count2(TT[31:28]) + Count2(TT[27:24]) + Count2(TT[23:20])
               + Count2(TT[19:16]) + Count2(TT[15:12]) + Count2(TT[11:8]) +
               Count2(TT[7:4]) + Count2(TT[3:0]);
    default: Count = Count2(TT);
  endcase
end

// Count2 returns the number of 1s (0 through 4) in a 4-bit input.
function [2:0] Count2;
  input [3:0] AA;
  begin: f2
    // Bit 0: parity of the four inputs.
    Count2[0] = AA[3]^AA[2]^AA[1]^AA[0];
    // Bit 1: at least two inputs are 1, but not all four.
    Count2[1] = (AA[3]&AA[2] | AA[3]&AA[1] | AA[3]&AA[0] | AA[2]&AA[1] | AA[2]&AA[0]
                | AA[1]&AA[0]) ^ (AA[3]&AA[2]&AA[1]&AA[0]);
    // Bit 2: all four inputs are 1.
    Count2[2] = AA[3]&AA[2]&AA[1]&AA[0];
  end
endfunction

endmodule


All that is needed in the code above to get the correct output is to change the
parameter n to the desired number of variables. If n>6, lines in the case statement can
easily be added to count higher order bits. Unfortunately, the disadvantage is that the
module needs to be completed within one clock cycle and, as n increases, the number of
required operations doubles. The entire calculation for n>5 cannot be computed in one
clock cycle using this code.
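The Boolean equations inside Count2 can be checked against an ordinary population count with a small software model. The sketch below assumes that the two parenthesized groups in the expression for bit 1 are combined with Exclusive-Or; the names `count2` and `ones_count` are Python stand-ins for the Verilog function and module, not part of the thesis code.

```python
def count2(aa):
    """Software model of Count2: 3-bit count of the 1s in a 4-bit value."""
    b = [(aa >> i) & 1 for i in range(4)]
    bit0 = b[3] ^ b[2] ^ b[1] ^ b[0]                       # parity of the inputs
    any_pair = ((b[3] & b[2]) | (b[3] & b[1]) | (b[3] & b[0])
                | (b[2] & b[1]) | (b[2] & b[0]) | (b[1] & b[0]))
    all_four = b[3] & b[2] & b[1] & b[0]
    bit1 = any_pair ^ all_four                             # assumed XOR between the two groups
    return (all_four << 2) | (bit1 << 1) | bit0


def ones_count(tt, n):
    """Model of the case statement: add Count2 over every 4-bit slice of a 2^n-bit truth table."""
    return sum(count2((tt >> (4 * k)) & 0xF) for k in range((2 ** n) // 4))


print([count2(v) for v in range(16)])
print(ones_count(0xFFFF0000FFFF0001, 6))   # 33
```

Each added value of n doubles the number of 4-bit slices that must be summed, which is the growth in work per clock cycle described above.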

C.  ANALYSIS 

Less data could be computed for this thesis than originally predicted. The code
written initially was inefficient and resulted in compile problems when n>5. New code
was written, and some additional results were generated. The remainder of this section
is an explanation of the data found using the SRC-6.

1.  Nonlinearity  of  Boolean  Functions  by  Degree  for  n=4 

There are $2^{16}$ Boolean functions on 4 variables. Figure 12 shows the distribution
of these functions by degree and nonlinearity. There are 896 bent functions, all of which
are of degree 2. There are 32 affine functions, which can be seen in the figure with
nonlinearity zero. The two functions of degree zero are 0x0000 and 0xFFFF in truth
table form, or $f(x_1, x_2, x_3, x_4) = 0$ and $f(x_1, x_2, x_3, x_4) = 1$ in ANF. The 30
functions of degree one are all combinations of the terms of degree zero and one that
contain at least one term of degree one. Figure 12 also shows that the nonlinearities are
even for functions with degree 2 or 3. This is not true for functions with degree 4. In
this case, all nonlinearities are odd. McEliece's Theorem from Coding Theory [23] states
that, for a Boolean function of degree d on n variables, the nonlinearity is always
divisible by $2^{\lceil n/d \rceil - 1}$. This means that the nonlinearity will be even as
long as $d < n$. This can be seen throughout all sets of data.
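
The counts quoted for n=4 are small enough to reproduce on a PC. The sketch below is an illustrative Python reconstruction, not the SRC-6 code: it enumerates all 4-variable truth tables, computes each degree with a binary Moebius transform (used here in place of the transeunt triangle) and each nonlinearity by direct comparison with the affine functions. All names are assumptions.

```python
from collections import Counter

N = 4
SIZE = 1 << N
# Truth tables of the 2**N linear functions: bit x holds the parity of (a AND x).
LINEAR = [sum((bin(a & x).count("1") & 1) << x for x in range(SIZE)) for a in range(SIZE)]
# KEEP[i] selects the truth-table positions whose index has bit i clear.
KEEP = [sum(1 << x for x in range(SIZE) if not x & (1 << i)) for i in range(N)]


def nonlinearity(tt):
    """Minimum Hamming distance from tt to any affine function."""
    best = SIZE
    for lin in LINEAR:
        d = bin(tt ^ lin).count("1")
        best = min(best, d, SIZE - d)  # SIZE - d is the distance to the complement
    return best


def degree(tt):
    """Algebraic degree, read from the ANF produced by a Moebius transform."""
    for i in range(N):
        tt ^= (tt & KEEP[i]) << (1 << i)
    return max((bin(x).count("1") for x in range(SIZE) if (tt >> x) & 1), default=0)


def distribution():
    """Count functions for every (degree, nonlinearity) pair."""
    table = Counter()
    for tt in range(1 << SIZE):
        table[(degree(tt), nonlinearity(tt))] += 1
    return table
```

Calling distribution() takes a few seconds in pure Python; the resulting table can be compared against Figure 12 and against the divisibility condition of McEliece's Theorem.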
Figure 12. Distribution of Functions with 4 Variables by Nonlinearity and Degree
(chart axes: Nonlinearity and Degree; count axis ticks at 1, 32, 1024, and 32768)


2.  Nonlinearity  of  Boolean  Functions  by  Degree  for  n=5 

Functions on an odd number of variables do not include truly bent functions. The
results for degrees 0 through 3 are nevertheless shown in Figure 13. The higher degrees
could not be computed, because there are almost $2^{31}$ functions of degree 4 and
$2^{31}$ functions of degree 5. The highest nonlinearity found is 12. It is interesting
that 84% of the functions of degree 2 have the highest nonlinearity. Only 20% of the
functions of degree 3 have nonlinearity 12. It is known that there are a total of
27,387,136 functions on 5 variables with nonlinearity 12. The functions represented in
degrees 2 and 3 make up about half of the total.

Figure 13. Distribution of Functions with 5 Variables by Nonlinearity and Degrees 0 through 3

3.  Nonlinearity  of  Boolean  Functions  by  Degree  for  n=6.  Degrees  Less 
Than  3  Only 

Since Lemma 2.2 states that there are no bent functions with degrees higher than
n/2, we know that there are no bent functions on 6 variables of degree 4, 5, or 6. These,
therefore, do not need to be evaluated. All $2^{42}$ functions of degree 3 could not be
tested in a reasonable amount of time. Table 6 shows the represented nonlinearities with
the number of functions in each tested degree. From the equation in Theorem 2.2,
$2^{n-1} - 2^{n/2-1}$, the maximum nonlinearity is $2^5 - 2^2 = 28$. It is known that there
are a total of 5,425,430,528 bent functions on 6 variables. There are 1,777,664 bent
functions of degree 2, so the remaining bent functions must be of degree 3.


Nonlinearity    Degree 0    Degree 1     Degree 2
      0             2          126               0
     16             0            0          83,328
     24             0            0       2,333,184
     28             0            0       1,777,664

Table 6. Distribution of Functions on 6 Variables by Nonlinearity and Degrees 0, 1, and 2
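
The degree-2 bent count can be cross-checked without a nonlinearity circuit. The Python sketch below relies on a standard fact rather than on the thesis code: a quadratic function is bent exactly when the symmetric GF(2) matrix of its degree-2 terms is invertible, and adding any affine function does not change this. The function names are illustrative.

```python
from itertools import combinations


def max_nonlinearity(n):
    """Bent bound 2**(n-1) - 2**(n/2 - 1) for even n."""
    return 2 ** (n - 1) - 2 ** (n // 2 - 1)


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def count_bent_quadratic_parts(n):
    """Count sets of degree-2 terms x_i x_j whose symmetric matrix has full rank."""
    pairs = list(combinations(range(n), 2))
    total = 0
    for choice in range(1 << len(pairs)):
        rows = [0] * n
        for k, (i, j) in enumerate(pairs):
            if (choice >> k) & 1:
                rows[i] |= 1 << j
                rows[j] |= 1 << i
        if gf2_rank(rows) == n:
            total += 1
    return total
```

For n=6, multiplying count_bent_quadratic_parts(6) by the $2^7$ affine functions gives the number of degree-2 bent functions, which can be compared with the last row of Table 6.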

4. Nonlinearity of Homogeneous Boolean Functions by Degree for n=4

There are 96 homogeneous functions on four variables. Twenty-eight of these are
bent. This is 3.125% of all 896 4-variable bent functions. There are 15 homogeneous
functions of degree 1. It is obvious that there is only one homogeneous function of
degree n, since there is only one term of degree n. In Figure 14, this function has
nonlinearity 1, which is simple to verify. The function in ANF,
$f(x_1, x_2, x_3, x_4) = x_1 x_2 x_3 x_4$, when converted to a TT, becomes 0x8000. This
function is not affine and is only one bit different from the affine function 0x0000, so
its nonlinearity is one.


Figure 14. Distribution of Homogeneous Functions with 4 Variables by Nonlinearity and Degree

5.  Nonlinearity  of  Homogeneous  Boolean  Functions  by  Degree  for  n=5 

Figure 15 shows that there are 868 Boolean functions of degree 2 and highest
nonlinearity 12. There are also 15 Boolean functions of degree 3 and nonlinearity 12.
An A-class of functions is a set where one highest nonlinearity function is combined with
every affine function to form $2^{n+1}$ functions with highest nonlinearity, one of which
is the original function. This comes from Lemma 2.1. These 883 functions can therefore be
used to form 883x63=55,629 more functions with nonlinearity 12. There also exist
functions with terms of both degree 2 and 3 that have the highest nonlinearity. These
functions can also be combined with the affine functions to determine the rest of the
highly nonlinear functions on 5 variables. These functions are in the group of degree 3
functions with nonlinearity 12 from section 2. The size of this set can be found by
subtracting, from the 13,999,104 functions in that group, the 15x64=960 functions that
only have terms of degree 0, 1, and 3. This leaves 13,999,104-960=13,998,144 functions of
nonlinearity 12 that have terms with degrees 0, 1, 2, and 3.
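
The A-class construction can be illustrated on 4 variables, where the full affine set is small. The following Python sketch is an assumption-laden example rather than the thesis code: it takes the bent function $x_1 x_2 \oplus x_3 x_4$ (truth table built with $x_1$ as the least significant input bit), adds every affine function to it, and leaves the nonlinearity check to the caller.

```python
N = 4
SIZE = 1 << N
ALL_ONES = (1 << SIZE) - 1
# Truth tables of the 2**N linear functions: bit x holds the parity of (a AND x).
LINEAR = [sum((bin(a & x).count("1") & 1) << x for x in range(SIZE)) for a in range(SIZE)]


def nonlinearity(tt):
    """Minimum Hamming distance from tt to any affine function."""
    return min(min(d, SIZE - d) for d in (bin(tt ^ lin).count("1") for lin in LINEAR))


def a_class(tt):
    """All functions obtained by adding each of the 2**(N+1) affine functions to tt."""
    return {tt ^ lin ^ const for lin in LINEAR for const in (0, ALL_ONES)}


# Example bent function f = x1*x2 XOR x3*x4.
BENT = sum(((x & (x >> 1) & 1) ^ ((x >> 2) & (x >> 3) & 1)) << x for x in range(SIZE))
```

Every member of a_class(BENT) has the same nonlinearity as BENT. The same argument gives the factor 63 used above for n=5, where each class holds $2^6 = 64$ functions including the starting one.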


Figure 15. Distribution of Homogeneous Functions with 5 Variables by Nonlinearity and Degree


6. Nonlinearity of Homogeneous Boolean Functions by Degree for n=6

This set of results, shown in Figure 16, shows that there are at least 13,918
A-classes of bent functions for n=6. Other A-classes will be composed of functions with
terms of both degree 3 and degree 2.

Figure 16. Distribution of Homogeneous Functions on 6 Variables by Nonlinearity and Degree

7.  Nonlinearity  of  Rotation  Symmetric  Boolean  Functions  by  Degree  for 
n=4 


Rotation symmetric functions are a small subset of all functions. Research shows
that bent functions can be found in these sets [14], [21]. For n=4, there are $2^6$
rotation symmetric functions, eight of which are bent. All eight functions are listed in
ANF in Appendix C.1.1.
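
The size of the rotation symmetric set follows from the number of orbits of input vectors under cyclic rotation, since a rotation symmetric function takes one value per orbit. The Python sketch below is an illustrative enumeration with assumed names, not the SRC-6 implementation.

```python
def rotate(x, n):
    """Cyclically shift the n-bit input vector x by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def rotation_orbits(n):
    """Partition the 2**n input vectors into orbits under cyclic rotation."""
    seen, orbits = set(), []
    for x in range(1 << n):
        if x in seen:
            continue
        orbit, y = [], x
        while y not in seen:
            seen.add(y)
            orbit.append(y)
            y = rotate(y, n)
        orbits.append(orbit)
    return orbits


def rotation_symmetric_functions(n):
    """Yield every truth table that is constant on each rotation orbit."""
    masks = [sum(1 << x for x in orbit) for orbit in rotation_orbits(n)]
    for choice in range(1 << len(masks)):
        yield sum(m for k, m in enumerate(masks) if (choice >> k) & 1)


def nonlinearity(tt, n):
    """Minimum Hamming distance from tt to any affine function on n variables."""
    size = 1 << n
    best = size
    for a in range(size):
        lin = sum((bin(a & x).count("1") & 1) << x for x in range(size))
        d = bin(tt ^ lin).count("1")
        best = min(best, d, size - d)
    return best
```

With g orbits there are $2^g$ rotation symmetric functions, so testing only this set reduces the search from $2^{2^n}$ functions to $2^g$.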


Figure 17. Distribution of Rotation Symmetric Functions on 4 Variables by Degree and Nonlinearity

8. Nonlinearity of Rotation Symmetric Boolean Functions by Degree for
n=5

There are $2^8$ rotation symmetric functions on 5 variables. It is interesting that the
number of functions in each group with nonlinearity greater than 12 in Figure 18 is a
multiple of 12. The functions with highest nonlinearity are listed in Appendix C.1.2.


Figure 18. Distribution of Rotation Symmetric Functions on 5 Variables by Degree and Nonlinearity

9.  Nonlinearity  of  Rotation  Symmetric  Boolean  Functions  by  Degree  for 
n=6 

There are $2^{14}$ rotation symmetric functions on 6 variables. In this set, there are 8
bent functions with degree 2 and 40 bent functions with degree 3. This is only 0.29% of
the function set. The graph in Figure 19 shows the distribution. The functions are listed
in Appendix C.1.3.

Figure 19. Distribution of Rotation Symmetric Functions on 6 Variables by Degree and Nonlinearity

10.  Nonlinearity  of  Homogeneous  Rotation  Symmetric  Boolean  Functions 
by  Degree  for  n=4 

To  find  the  rotation  symmetric  functions  that  are  homogeneous,  all  rotation 
symmetric  functions  were  converted  to  ANF  using  the  transeunt  triangle,  and  then  tested 
for  homogeneity.  If  the  function  was  homogeneous,  then  it  was  stored  along  with  its 
nonlinearity.  Figure  20  shows  the  results  for  n=4. 


Figure 20. Distribution of Homogeneous Rotation Symmetric Functions on 4 Variables by Degree and Nonlinearity

11. Nonlinearity of Homogeneous Rotation Symmetric Boolean Functions
by Degree for n=5

In  this  small  group,  the  highest  nonlinearity  functions  make  up  30%  of  the  entire 
set.  Figure  21  shows  the  distribution. 


Figure 21. Distribution of Homogeneous Rotation Symmetric Functions on 5 Variables by Degree and Nonlinearity


12.  Nonlinearity  of  Homogeneous  Rotation  Symmetric  Boolean  Functions 
by  Degree  for  n=6 

There are only two bent functions in this set of 32 functions. In order to get this
set of functions, all rotation symmetric functions had to be formed, converted to ANF
using the transeunt triangle, and then tested for homogeneity. These additional tests
require more time than just computing the nonlinearity of the group of $2^{14}$ rotation
symmetric functions on 6 variables. Both groups can be tested at the same time. It is
interesting to know which rotation symmetric functions are homogeneous, and the
transeunt triangle is an efficient way to determine this. It does not, however, reduce the
number of functions to be tested. The results are shown in Figure 22.


Figure 22. Distribution of Homogeneous Rotation Symmetric Functions on 6 Variables by Degree and Nonlinearity

13.  Nonlinearity  of  Dihedral  Symmetric  Boolean  Functions  by  Degree  for 
n=4 

One  way  to  reduce  the  number  of  rotation  symmetric  functions  is  to  test  only 
dihedral  symmetric  functions.  For  n=4  and  n=5  all  rotation  symmetric  functions  are 
dihedral  symmetric,  so  there  is  no  reduction.  This  data  is  the  same  as  the  data  in  section 
7. 


14.  Nonlinearity  of  Dihedral  Symmetric  Boolean  Functions  by  Degree  for 
n=5 

As  explained  in  the  previous  section,  this  data  is  the  same  as  the  data  in  section  8. 

15.  Nonlinearity  of  Dihedral  Symmetric  Boolean  Functions  by  Degree  for 
n=6 

Reducing the set of rotation symmetric functions on 6 variables to only those that
are also dihedral symmetric reduces the set by half. Figure 23 shows that there are only
16 bent functions in this set of $2^{13}$ functions. This is 33% of the rotation symmetric
bent functions. It is interesting to note that all of the rotation symmetric bent
functions of degree 2 are dihedral symmetric, while only 8 of 40 rotation symmetric bent
functions of degree 3 are dihedral symmetric.
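
The halving for n=6, and the absence of any reduction for n=4 and n=5, can be seen by counting orbits of input vectors. The Python sketch below is illustrative only: it assumes the dihedral group acts by cyclic rotation together with reversal of the input order, and the names are hypothetical.

```python
def rotate(x, n):
    """Cyclically shift the n-bit input vector x by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def reflect(x, n):
    """Reverse the order of the n input bits."""
    return int(format(x, "0{}b".format(n))[::-1], 2)


def count_orbits(n, dihedral):
    """Number of orbits of n-bit vectors under rotation, optionally with reflection."""
    seen, count = set(), 0
    for x in range(1 << n):
        if x in seen:
            continue
        count += 1
        y = x
        for _ in range(n):
            seen.add(y)
            if dihedral:
                seen.add(reflect(y, n))
            y = rotate(y, n)
    return count
```

A function that is constant on each orbit is free in one bit per orbit, so g orbits give $2^g$ functions. One fewer orbit for n=6 therefore halves the set.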


Figure 23. Distribution of Dihedral Symmetric Functions on 6 Variables by Degree and Nonlinearity

16.  Nonlinearity  of  Homogeneous  Dihedral  Symmetric  Boolean  Functions 
by  Degree  for  n=4 

This  data  is  the  same  as  the  data  in  section  10. 

17.  Nonlinearity  of  Homogeneous  Dihedral  Symmetric  Boolean  Functions 
by  Degree  for  n=5 

This data is the same as the data in section 11.

18.  Nonlinearity  of  Homogeneous  Dihedral  Symmetric  Boolean  Functions 
by  Degree  for  n=6 

The issue regarding the generation of this test set is the same as that in section 12.
All dihedral symmetric functions must be formed, converted to ANF, and then tested for
homogeneity. This group cannot be generated independently. It is interesting that the
only two homogeneous rotation symmetric bent functions are also dihedral symmetric.
The distribution is shown in Figure 24.


Figure 24. Distribution of Homogeneous Dihedral Symmetric Functions on 6 Variables by Degree and Nonlinearity


The  compiled  data  demonstrates  the  utility  of  the  SRC-6  computer  system.  It  was 
found  to  be  an  excellent  way  to  search  Boolean  functions  using  several  methods.  The 
nonlinearity  circuit  for  n=7  was  not  built  in  order  to  concentrate  on  building  the  circuit 
for  n=8.  The  following  section  discusses  problems  and  possible  solutions  for  the  circuit 
for  n=8. 

D.  OTHER  CONTRIBUTIONS 

1.  Functions  on  8  Variables 

The code to determine the nonlinearity of 8-variable functions was created and
compiled. The resources used are shown in Table 7. This is a large portion of resources
and a very low frequency. Expected results could not be calculated for any group of
8-variable functions because the frequency was much lower than 100 MHz.

Nonlinearity, 8 variables
Number of Slice Flip Flops    14,404 out of 67,584    21%
Number of 4-input LUTs        40,183 out of 67,584    59%
Number of occupied Slices     24,810 out of 33,792    73%
Number of Block RAMs          4 out of 144            2%
Freq                          66.1 MHz

Table 7. Resources Used for Finding Nonlinearity on 8-Variable Functions

There is a way to reduce the circuit, using fewer FPGA resources. The affine
functions are all the linear functions and their complements. The relationship between
the Hamming distances of $f \oplus a$ and $f \oplus \bar{a}$ is
$d(f \oplus a) + d(f \oplus \bar{a}) = 2^n$. If only half of the affine functions are
defined and used, the distance of the other half can be determined with a simple
comparison and subtraction operation. If the distance of $f \oplus a$ is less than
$2^{n-1}$ (half of the bits), then it is the minimum of $d(f \oplus a)$ and
$d(f \oplus \bar{a})$. If the distance is greater than $2^{n-1}$, then the minimum
distance is $2^n - d(f \oplus a)$. Changing the circuit to reflect this results in the
resources used, shown in Table 8.
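
The equivalence of the two approaches can be sketched in Python. This is an illustrative software model with assumed names, not the Verilog from the appendices: one function compares against all $2^{n+1}$ affine functions, the other uses only the $2^n$ linear functions and folds each distance with one comparison and one subtraction.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def linear_tables(n):
    """Truth tables of the 2**n linear functions (bit x = parity of a AND x)."""
    size = 1 << n
    return tuple(sum((bin(a & x).count("1") & 1) << x for x in range(size))
                 for a in range(size))


def nonlinearity_all_affine(tt, n):
    """Compare against every linear function and its complement explicitly."""
    ones = (1 << (1 << n)) - 1
    return min(bin(tt ^ lin ^ c).count("1") for lin in linear_tables(n) for c in (0, ones))


def nonlinearity_half(tt, n):
    """Use only the linear functions; derive the complement distance as 2**n - d."""
    size = 1 << n
    best = size
    for lin in linear_tables(n):
        d = bin(tt ^ lin).count("1")
        best = min(best, d if d <= size // 2 else size - d)
    return best
```

Both functions return the same value for every input, while the second performs half as many exclusive-or and count operations.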


Nonlinearity, 8 variables
Number of Slice Flip Flops    13,587 out of 67,584    20%
Number of 4-input LUTs        31,092 out of 67,584    46%
Number of occupied Slices     20,218 out of 33,792    59%
Number of Block RAMs          4 out of 144            2%
Freq                          65.8 MHz

Table 8. Resources Used for Finding Nonlinearity on 8-Variable Functions using a Minimized Circuit Design

Table 8 shows a significant reduction in FPGA space required but does not
change the frequency, so this circuit, as it is, cannot produce reliable results. Some
future modifications to expand the pipeline, however, could fix this issue. The Verilog
code did produce correct results when some functions were tested in ModelSim.

The generation of this code in structural form required hundreds of calls to each
major component, or module, of the nonlinearity computation. These lines can be
generated for the user with a simple C program. This will easily create Verilog code to
compute nonlinearity for higher n in future work. Following the structure of the code in
Appendix A.1.6 for n=6, the min2, min4, and OC modules remain the same. The module
count will call OC $2^n/4$ times and add each result to get the number of 1s in one
$2^n$-bit function. The module fit does most of the computation. It enumerates half of
the $2^{n+1}$ affine functions (the other half are the complements of the first half),
performs the Exclusive-Or operation on the input function and each affine function, and
then calls the count module with each result. In count, the number of 1s is compared with
$2^n/2$, and if it is less than or equal to this number, it is sent as the output;
otherwise the difference between $2^n$ and the number is sent as output. This allows for
both the affine function and its complement to be considered. Next, it calls a series of
min modules with the result of four count calls as input, and outputs the minimum count.
These results then go four at a time into min modules again in a tree-like fashion until
the minimum of all count outputs is found. The result is the nonlinearity of the input
function. The C code to generate the modules count and fit for any n is included in
Appendix B.3. Using this program saves time and prevents inadvertent mistakes.
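
The tree of min modules can be modeled in a few lines. The Python sketch below is a hypothetical software analogue, not generated Verilog: it reduces a list of count outputs four at a time and reports how many tree levels and how many min modules such a reduction would need. A final group of two values corresponds to a min2 module.

```python
def min_tree(values, fan_in=4):
    """Reduce values in groups of fan_in; return (minimum, levels, modules)."""
    level = list(values)
    levels = modules = 0
    while len(level) > 1:
        # Each group of up to fan_in values feeds one min module.
        level = [min(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
        modules += len(level)
        levels += 1
    return level[0], levels, modules
```

For n=6 the 64 count outputs need three levels; for n=8 the 256 outputs need four, which is one reason the pipeline depth grows with n.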

The section of code that must be computed separately is the generation of the
affine functions. This was done with a user-defined macro since affine functions have
more than the standard 64 bits for n>6. The result is $2^n$ Verilog assignment
statements, one for each of half the affine functions. The code is included in Appendix
A.8. There are corrections that must be made to this code because the program does not
print leading zeros in hexadecimal numbers. The code to generate affine functions uses a
repetitive pattern. There are missing zeros only in affine functions with the pattern made
of 0xF and 0x0. Because of the spaces left in between the patterns, it is easy to locate
the position for the missing zeros. The user can add them in by hand and delete the space,
since it will cause an error. When this is complete, the user can copy and paste the
affine assign statements into the code results from above.
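
The leading-zero problem comes from printing a value without a fixed field width. The Python sketch below is an illustration with assumed names, not the Appendix A.8 code: it builds a linear function's truth table and prints it with exactly $2^n/4$ hexadecimal digits, so no hand correction is required.

```python
def linear_truth_table(mask, n):
    """Truth table of the linear function selected by mask (bit x = parity of mask AND x)."""
    tt = 0
    for x in range(1 << n):
        if bin(mask & x).count("1") & 1:
            tt |= 1 << x
    return tt


def padded_hex(tt, n):
    """Hexadecimal string with exactly 2**n / 4 digits, leading zeros included."""
    digits = max(1, (1 << n) // 4)
    return format(tt, "0{}X".format(digits))
```

For n=4, the linear function with mask 0b1100 has the truth table 0x0FF0; an unpadded print drops the leading zero, while padded_hex keeps all four digits.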

2.  Parameterization 

Attempting to parameterize code so that programs work for multiple situations
can save time, but it can also disrupt readability for future users. For example,
parameterized code was written to generate all Boolean functions given the number of
variables n and the desired degree d. The generation of only homogeneous functions can be
done with a simple modification. The generated function can then be transformed from
ANF to TT and tested for nonlinearity. When this code is parameterized, several sets of
data can be tested without writing a new program; the user only needs to change the
parameters. The algorithm is explained here.

The code given in Appendix A.6 shows a subroutine that calls three macros.
Macro_1 creates a vector indicating the indexes in a function that have the specified
degree. If n=4 and d=2 the vector would be [0001011001101000] with the MSB of the
vector on the left, index 15. A bit in the vector that is a 1 represents a term in the ANF
of the 4-variable function that has degree 2. For non-homogeneous functions, another
vector is created with a 1 in every place where the degree of the term is less than or
equal to d. Following the above example, the vector would be [0001011101111111]. The
macro also returns length, the result of 2 raised to either the number of ones in the
first vector, to generate homogeneous functions, or the number of ones in the second
vector, to generate all functions of highest degree d. The subroutine then calls macro_2
length-1 times using a counter. For homogeneous functions, the macro is called using a for
loop as a counter for the input, starting with 1 instead of zero since there must be at
least one 1 in a term with the desired degree. Macro_2 places each bit of the counter in a
place in the new function where there is a 1 in the representative vector. All other
indices in the vector will receive a 0. For non-homogeneous functions, there is a nested
for loop where, for each bit in the first vector that is a 1, macro_2 is called length/2
times. This will ensure that for each term of degree 2, all possible functions are formed.
The resulting function is in Algebraic Normal Form. It is converted to a truth table using
the transeunt triangle and then sent to the nonlinearity module, macro_3, for testing. The
results are compiled into a histogram and the output is given in main.c.
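
The two steps of the generator, building the degree vector and distributing counter bits into it, can be sketched in Python. This is an illustrative model under assumed names (degree_mask, scatter, homogeneous_anfs), not the macros from Appendix A.6, and it covers only the homogeneous case.

```python
def degree_mask(n, d, homogeneous=True):
    """Bit i is 1 when ANF index i names a term of degree d (or at most d)."""
    mask = 0
    for index in range(1 << n):
        weight = bin(index).count("1")  # degree of the term with this index
        if weight == d or (not homogeneous and weight <= d):
            mask |= 1 << index
    return mask


def scatter(counter, mask):
    """Place successive counter bits at the 1 positions of mask, low index first."""
    out, pos = 0, 0
    while mask:
        if mask & 1:
            out |= (counter & 1) << pos
            counter >>= 1
        mask >>= 1
        pos += 1
    return out


def homogeneous_anfs(n, d):
    """Yield every nonzero homogeneous ANF of degree d on n variables."""
    mask = degree_mask(n, d)
    length = 1 << bin(mask).count("1")
    for counter in range(1, length):  # start at 1: at least one term must be present
        yield scatter(counter, mask)
```

Each yielded value is an ANF coefficient vector; converting it to a truth table and measuring its nonlinearity are the remaining steps of the subroutine.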

Unfortunately, the SRC-6 compiler gives the following error for n=6, d=1:

Constraint                                            | Requested | Actual   | Logic Levels
* TS_CLOCK = PERIOD TIMEGRP "CLOCK" 10 ns HIGH 50%    | 10.000ns  | 15.807ns | 20

The  clock  on  the  FPGA  needs  to  run  at  10  ns  intervals.  In  this  example,  the  clock 
cannot  run  faster  than  15.8  ns  to  get  through  all  operations  in  a  given  clock  period.  The 
code  cannot  be  appropriately  mapped.  Using  the  non-parameterized  code,  a  simple 
mapper,  the  clock  was  able  to  get  through  its  operations  in  10  ns.  The  following  shows 
that  the  constraint  is  met: 


Constraint                                            | Requested | Actual   | Logic Levels
TS_CLOCK = PERIOD TIMEGRP "CLOCK" 10 ns HIGH 50%      | 10.000ns  | 9.989ns  | 9

The above is from code written for n=6, d=2. This lesson learned was
disappointing because, without parameterization, it took much longer to write code and
run tests.

3.  Circuit  Minimization-Reducing  Affine  Function  Comparators 

In attempting to find ways to make the search for bent functions more efficient,
the affine functions were examined. In the definition of nonlinearity, it is specified
that a test function must be compared to all the affine functions. Reducing the
number of affine functions actually compared to the test function can be critical to the
design of a faster or smaller circuit. In an attempt to find a trend in the nonlinearity
of functions when they are compared to only a subset of affine functions, the following
results were discovered. The nonlinearity circuit was run on all 4-variable functions.

Using only the five affine functions with one term, $0, x_1, x_2, x_3, x_4$, the results
are shown in Table 9. Although 6 is the maximum nonlinearity, many functions were
evaluated at higher nonlinearities, because the affine functions with lower Hamming
distances to the test function were not evaluated. Of the 19,941 functions found with
nonlinearity 6, it is known that 896 of these are the actual bent functions.


Nonlinearity    Using only 5 affine functions    Actual Result
      0                       5                         32
      1                      80                        512
      2                     600                      3,840
      3                   2,800                     17,920
      4                   8,681                     28,000
      5                  17,176                     14,336
      6                  19,941                        896
      7                  12,345                          0
      8                   3,546                          0
      9                     356                          0
     10                       6                          0

Table 9. The SRC-6 Results of a Nonlinearity Circuit that Only Compares the Test Function Against Five Affine Functions for n=4

A  slight  alteration  can  be  made  to  evaluate  nonlinearity  based  on  the  five  affine 
functions  listed  above  and  their  complements.  The  results  are  listed  in  Table  10.  The 
circuit  modification  that  adds  the  test  of  the  complements  of  the  five  affine  functions  only 
very  slightly  increases  the  resources  used,  but  the  additional  comparison  in  the  count 
module  reduces  the  frequency  from  113.8  MHz  to  100.8  MHz.  This  frequency  is  still 
good  for  computations,  but  increasing  n  may  cause  problems. 

Nonlinearity    Using only 10 affine functions    Actual Result
      0                      10                         32
      1                     157                        512
      2                   1,174                      3,840
      3                   5,462                     17,920
      4                  15,480                     28,000
      5                  23,450                     14,336
      6                  16,045                        896
      7                   3,665                          0
      8                      93                          0

Table 10. The SRC-6 Results of a Nonlinearity Circuit that Only Compares the Test Function Against Five Affine Functions and their Complements for n=4

Evaluating the above results leads to interesting observations. It is known that
there are only two distances between any bent function and any affine function. For n=4
these distances are 6 and 10. Because the sum of the distances of $f \oplus a$ and
$f \oplus \bar{a}$ is $2^n$, if the distance between one affine function and a test
function is 6, then the distance between the test function and the complement of that
affine function must be 10. Since we know the highest nonlinearity for even n, we could
test all functions against only a certain set of affine functions, and if the distance is
equal to the highest nonlinearity or to $2^n$ - (highest NL), then that function is kept
and further tested; otherwise, it is not a bent function. The group of functions kept is
significantly reduced by this minimized circuit, as shown in Table 9. The minimum distance
between a test function and five affine functions was set as the nonlinearity. In this
choice of affine functions, it is known that if the test function is bent the distance
will be 6. From this, we know that all 896 bent functions are included in the set of
19,941 functions that were found. Adding the complements of the five functions reduced the
set of functions found with nonlinearity 6 to 16,045. The reduced circuit can reduce the
group of functions that need to be tested against the entire group of affine functions,
therefore reducing the amount of time needed to test all functions. The distance between
all bent functions for any n and a certain group of affine functions is not always known;
however, this idea can be used to reduce the circuit if a pattern can be found.
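
The keep-or-discard rule can be sketched as a two-stage search in Python. This is one possible reading of the rule, with assumed names, and it is not the circuit that produced Tables 9 and 10: a function survives the first stage only if its distance to every affine function in the small subset is the highest nonlinearity or $2^n$ minus that value, and only survivors are compared with the full set of linear functions.

```python
N = 4
SIZE = 1 << N
NL_BENT = 6  # highest nonlinearity for n = 4


def linear_tt(mask):
    """Truth table of the linear function selected by mask."""
    return sum((bin(mask & x).count("1") & 1) << x for x in range(SIZE))


SUBSET = [linear_tt(m) for m in (0, 1, 2, 4, 8)]  # the functions 0, x1, x2, x3, x4
ALL_LINEAR = [linear_tt(m) for m in range(SIZE)]


def has_bent_distances(tt, tables):
    """True when every distance is NL_BENT or SIZE - NL_BENT."""
    return all(bin(tt ^ a).count("1") in (NL_BENT, SIZE - NL_BENT) for a in tables)


def search():
    """Return (number of first-stage survivors, number of bent functions found)."""
    survivors = [tt for tt in range(1 << SIZE) if has_bent_distances(tt, SUBSET)]
    bent = [tt for tt in survivors if has_bent_distances(tt, ALL_LINEAR)]
    return len(survivors), len(bent)
```

Because a bent function is at distance 6 or 10 from every affine function, the first stage never discards one, so the second stage still finds all 896 while examining only the survivors.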

Another way to further reduce the circuit is to test several functions at once.
Since the nonlinearity circuit as it is now is too large to be able to fit multiple
circuits on a single FPGA, a smaller circuit must be designed. Using the idea of testing
against only certain affine functions brings up an idea where $2^{n+1}$ (the number of
affine functions) test functions can begin testing on the same clock period, each function
being Exclusive-Or'd with a different affine function. If the distance is $NL_b$ or
$2^n - NL_b$, then the function should be further tested; otherwise, it is not a bent
function and a new function should start the testing process. This circuit seems
complicated and was not built for this thesis. Future work on this idea could
significantly improve Boolean function testing.

4.  Circuit  Minimization-The  Transeunt  Triangle 

The  full  transeunt  triangle  uses  many  Exclusive-Or  operations,  and  the  SRC-6 
compiler  cannot  compile  the  transjri  module  for  n>8.  The  following  is  an  explanation 
of  the  significant  reduction  of  the  number  of  required  Exclusive-Or  operators  for  this 
module. From the proof of the transeunt triangle in Chapter III and Figure 5, it is shown
that a transeunt triangle for n=k can be made from two triangles of size n=k-1. This can
be further reduced by forming the triangle using $2^{n-2}$ triangles of size n=2, where the
triangle for n=2 is reduced from 6 Exclusive-Or operators to only 4. This results in an
even larger reduction in the number of Exclusive-Or operators required. Consider Figure
25.  There  are  two  Exclusive-Or  operators  left  off  this  triangle,  but  examination  of  the 
figure  shows  that  they  are  redundant.  When  moving  from  the  first  level  of  operations  to 
the  second  level,  the  unnecessary  coefficients  are  cancelled  out.  In  this  reduced  triangle, 
the  redundant  operations  are  left  off. 

Figure 25. Reduced Transeunt Triangle for n=2

This is a one third reduction in operators. Following the logic of the proof in
Chapter III, the triangle for n=3 can be formed by placing two n=2 triangles on the left
side and adding 4 operators in between to connect all inputs to the upper triangle. This is
shown in Figure 26.


Figure 26. Reduced Transeunt Triangle for n=3


The generalization of this reduction is that an n=k reduced triangle can be formed
recursively using two n=k-1 triangles connected with $2^{n-1}$ operators. Mathematically,
the full transeunt triangle has the following number of Exclusive-Or operators:

$F(n) = \frac{2^n (2^n - 1)}{2} = C(2^n, 2)$

$F(n) \approx \frac{4^n}{2}$ 2-input exclusive-OR operators

The reduced triangle has:

$R(n) = 2R(n-1) + 2^{n-1}$ 2-input exclusive-OR operators

Solving this linear recurrence relation yields:

$R(n) = n \cdot 2^{n-1}$
As n increases, the reduction in Exclusive-Or operations increases significantly.
Table 11 shows this for several n. For example, the number of Exclusive-Or operators
required for the full transeunt triangle for n=11 is about 2,000,000, whereas the number
of operators required for the reduced triangle is only 11,264. This is a 99.4% reduction.

 n      R(n)         F(n)
 2         4            6
 3        12           28
 4        32          120
 5        80          496
 6       192        2,016
 7       448        8,128
 8     1,024       32,640
 9     2,304      130,816
10     5,120      523,776
11    11,264    2,096,128
12    24,576    8,386,560

Table 11. The Difference in Number of Exclusive-Or Operators in the Reduced Circuit R(n) Versus the Full Circuit F(n)


The  full  transeunt  triangle  for  n=9  could  not  be  compiled  on  the  SRC-6.  The 
combination  of  two  full  n=8  triangles  could  be  compiled  to  get  an  n=9  result,  and  the 
resources  were  listed  as  follows. 

Number  of  Slice  Flip  Flops:  9,155  out  of  67,584  13% 

Number  of  4  input  LUTs:  6,413  out  of  67,584  9% 

Number  of  occupied  Slices:  7,195  out  of  33,792  21% 

freq  =  98.5  MHz 

The reduced triangle for n=9 compiled and the resources are listed below. While the percent of Slice Flip Flops increases slightly, the number of LUTs and Slices are reduced. Also, the frequency is back above 100 MHz, the desired minimum. The increase in flip flops is expected because the reduced triangle has a longer pipeline. The flip flops are used to ensure data is clocked properly.

Number of Slice Flip Flops: 11,658 out of 67,584 17%

Number  of  4  input  LUTs:  4,150  out  of  67,584  6% 

Number  of  occupied  Slices:  6,717  out  of  33,792  19% 

freq  =  101.9  MHz 

The  reduced  transeunt  triangle  can  be  used  to  convert  functions  with  specific 
degree  to  a  truth  table,  so  that  they  may  be  studied.  The  code  is  included  in  Appendix 
A.4.  At  this  time,  the  nonlinearity  circuit  on  the  SRC-6  cannot  compute  functions  with 
more  than  7  variables.  If  the  nonlinearity  circuit  can  be  optimized  for  greater  n,  the 
reduced  transeunt  triangle  will  help  find  more  groups  of  Boolean  functions. 
V. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS

A  distribution  of  function  groups  based  on  their  properties  was  accomplished  in 
this  thesis.  This  work  was  limited  by  the  capacity  and  speed  of  the  Xilinx  Virtex  2 
FPGA.  Table  12  shows  a  summary  of  results. 


n   All    Bent            Homogeneous  Bent     ROTS   Bent  Homog ROTS  Bent
4   2^16   896             96           28       2^6    8     8           2
5   2^32   27,387,136      2,111        883      2^8    36    10          3
6   2^64   5,425,430,528   1,114,238    13,918   2^14   48    32          2

n   Dihedral  Bent  Homog Dihedral  Bent  Balanced              ROTS Bal  Bent Bal
4   2^6       8     8               2     12,870 (2^13.5)       6         0
5   2^8       36    10              3     601,080,390 (2^29)    40        0
6   2^13      16    26              2     1.83E+18 (2^60)       504       0

Table 12. Summary of Computational Results


The  table  shows  the  total  number  of  functions  in  a  group  followed  by  the  number 
of  bent  functions  in  that  group  for  each  n.  The  second  to  last  column  shows  rotation 
symmetric functions according to balance. The graphs shown in this thesis demonstrate that the SRC-6 is a good platform with which to test Boolean functions. They also indicate that the code works correctly, based on comparisons to previously known data.

The  implementation  of  the  transeunt  triangle  was  extremely  beneficial  for 
examining  properties  of  functions  and  also  for  creating  groups  of  functions.  Although  the 
transeunt  triangle  is  generally  accepted  as  true,  no  proof  was  previously  known;  however, 
a  proof  is  included  in  this  thesis.  The  use  of  the  Synplicity  Pro  compiler  for  SRC-6 
programs  was  a  great  tool  in  examining  the  layout  of  the  circuit  designed  by  the  user. 
Being  able  to  trace  the  longest  path  helped  to  decide  where  to  modify  or  simplify  a 
circuit.  This  led  to  significant  improvements  in  circuit  design,  including  the  reduced 
transeunt  triangle. 
Included in the Appendices are several sets of code that will aid future students in
continuing  this  research.  If  the  SRC-6  is  upgraded  or  other  groups  of  functions  are 
predicted  to  have  high  nonlinearity,  students  can  use  the  included  code  as  a  tool.  If 
further  advances  can  be  made,  the  nonlinearity  circuit  for  n=10  or  higher  could  be 
implemented. 

B.  RECOMMENDATIONS 

Several  ideas  on  how  to  improve  upon  the  work  presented  in  this  thesis  have  been 
discussed.  There  are  several  options  to  enhance  the  SRC-6’s  effectiveness.  There  are 
two  programmable  FPGAs  on  each  MAP.  Currently,  only  one  is  used.  The  issue  with 
having  two  FPGAs  communicate  is  that  only  one  64-bit  value  can  be  passed  to  and  from 
the FPGAs at a time. A 12-variable function is 4096 bits long, and would need 64 values (4096/64) passed across before the entire function could be reformed. This can slow the pipelining
process  considerably.  An  alternative  is  to  implement  separate  instantiations  of  the  same 
program,  one  on  each  FPGA.  This  way,  two  functions  per  clock  cycle  could  be  tested 
instead  of  one.  In  this  case  there  is  no  communication  between  FPGAs. 

Another  possible  improvement  is  to  reduce  the  circuitry  required  to  perform  the 
same  operations.  Possible  ideas  for  this  were  discussed  in  this  thesis,  including  not 
testing  each  function  against  all  affine  functions,  but  against  only  a  subset.  Every 
reduction  in  the  number  of  affine  functions  tested  reduces  the  required  circuit  size.  Other 
possibilities  include  not  counting  all  the  ones  in  every  function,  or  not  testing  certain 
groups  for  the  minimum  value,  in  the  minimization  circuit.  These  ideas  require  a  search 
for  trends  or  patterns  since  the  nonlinearity,  as  it  is  defined,  would  not  be  fully  computed. 
For  instance,  the  chosen  subset  of  affine  functions  must  be  able  to  correctly  predict  the 
nonlinearity  of  the  test  functions,  or  perhaps  give  a  range  of  its  nonlinearity. 
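For reference, the full computation that these reductions would approximate can be sketched in software. The following C sketch is an illustration under stated assumptions, not the circuit from this thesis: the function names are hypothetical, and the truth table is assumed to be packed into a 64-bit word with x0 as the least significant index bit, which matches the affine-function constants listed in Appendix A. It compares a 6-variable function against all 64 linear functions and their complements and keeps the smallest Hamming distance.

```c
#include <assert.h>
#include <stdint.h>

/* Number of ones in a 64-bit word. */
int ones64(uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* Truth table of the linear function a.x on 6 variables:
   bit x of the result is the parity of (a AND x). */
uint64_t linear_tt(unsigned a) {
    uint64_t tt = 0;
    for (unsigned x = 0; x < 64; x++)
        if (ones64(a & x) & 1)
            tt |= (uint64_t)1 << x;
    return tt;
}

/* Nonlinearity: minimum distance to any affine function. */
int nonlinearity6(uint64_t tt) {
    int best = 64;
    for (unsigned a = 0; a < 64; a++) {
        int d = ones64(tt ^ linear_tt(a));
        if (64 - d < d) d = 64 - d;  /* distance to the complemented function */
        if (d < best) best = d;
    }
    return best;
}

/* Example input: truth table of x0x1 XOR x2x3 XOR x4x5, a bent function. */
uint64_t example_bent_tt(void) {
    uint64_t tt = 0;
    for (unsigned x = 0; x < 64; x++) {
        unsigned v = (x & (x >> 1)) ^ ((x >> 2) & (x >> 3)) ^ ((x >> 4) & (x >> 5));
        if (v & 1) tt |= (uint64_t)1 << x;
    }
    return tt;
}
```

Calling nonlinearity6(example_bent_tt()) returns 28, the highest nonlinearity for n=6. Restricting the loop over a to a subset, as suggested above, would yield only an upper bound on the nonlinearity unless the subset is known to contain the closest affine function.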

There  are  several  possibilities  in  designing  a  circuit  that  tests  several  Boolean 
functions  at  one  time.  One  idea  is  a  pipelined  ladder,  where  functions  are  tested  against 
one  affine  function,  evaluated  for  possible  high  nonlinearity,  then  either  dropped  out  or 
moved  to  the  next  affine  function.  At  every  clock  cycle,  a  new  function  would  enter  the 
ladder. The pipeline would be much longer than the current circuit but could produce more functions of interest per clock cycle. A similar idea is to develop a circular pipeline, where $2^{n+1}$ functions are tested, each against a different affine function, evaluated for interest, then either dropped out or continued around the circle. Every time a function drops out, a new one will enter the circle in its place. This circuit may be complicated but could result in a speed-up when testing larger groups of functions.
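The drop-out rule shared by the ladder and the circular pipeline can be illustrated in software. The sketch below is only a sequential analogy written in C for n=6, not a hardware design from this thesis; the names and the threshold parameter are hypothetical. Each rung compares the function against one linear function and its complement, and the function is dropped as soon as its distance falls below the threshold of interest.

```c
#include <assert.h>
#include <stdint.h>

/* Number of ones in a 64-bit word. */
int ladder_ones(uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* Truth table of the linear function a.x on 6 variables. */
uint64_t ladder_linear_tt(unsigned a) {
    uint64_t tt = 0;
    for (unsigned x = 0; x < 64; x++)
        if (ladder_ones(a & x) & 1)
            tt |= (uint64_t)1 << x;
    return tt;
}

/* Returns the 1-based rung at which the function is dropped,
   or 0 if it survives all 64 rungs (nonlinearity >= threshold). */
int ladder_drop_rung(uint64_t tt, int threshold) {
    for (unsigned a = 0; a < 64; a++) {
        int d = ladder_ones(tt ^ ladder_linear_tt(a));
        if (64 - d < d) d = 64 - d;  /* also covers the complemented function */
        if (d < threshold) return (int)a + 1;
    }
    return 0;
}
```

With a threshold of 28, the constant-zero function is dropped at the first rung, while the bent function x0x1 XOR x2x3 XOR x4x5 (truth table 0x8777788878887888) survives every rung. In hardware, each rung would be a pipeline stage, so functions that drop early free their slot for new candidates.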

The addition of FPGAs with higher clock frequency into the SRC-6 would increase the number of computations per second without having to change any code. This would only be a slight increase, however, and not a permanent solution. The ability to
find  smaller  groups  of  test  functions  than  the  ones  studied  in  this  thesis  would  help 
increase  the  number  of  variables  that  can  be  tested.  The  higher  the  number  of  variables 
in  a  function,  the  more  truth  table  inputs  there  are,  thus  the  more  bit-by-bit  operations  that 
need  to  be  completed  per  function.  This  slows  down  the  pipeline  causing  problems  with 
frequency.  This  can  be  examined  and  solutions  found  to  accomplish  each  operation  in 
only  one  clock  cycle.  A  longer  pipeline  may  be  needed  here,  but  this  does  not  hurt  the 
overall  throughput  if  one  function  per  clock  cycle  can  still  be  tested. 

The  circuitry  for  the  transeunt  triangle  was  reduced  significantly  in  this  thesis.  It 
can  be  further  reduced  considering  the  following.  It  is  common  when  generating  specific 
groups  of  functions,  especially  in  ANF,  that  several  inputs  will  be  zero.  For  example,  if 
generating  only  6-variable  functions  with  only  terms  of  degree  3,  at  least  44  of  the  64 
inputs  will  contain  a  zero  (the  number  of  terms  that  are  not  degree  3).  If  converting  from 
the  ANF  of  a  function  to  the  truth  table,  and  only  functions  with  specific  degree  will  be 
tested,  several  zeros  will  be  input  into  the  transeunt  triangle  in  known  locations. 
Depending  on  where  the  zeros  are  in  the  input,  several  reductions  could  be  made.  For 
example, if bits 4, 5, 6, and 7 in an 8-bit input are all zero, then the result is just the transeunt triangle of inputs 0, 1, 2, 3 two times, since $x \oplus 0 = x$. This realization reduces
the  pipeline  as  well  as  the  number  of  Exclusive-Or  operators.  Further  study  of  similar 
patterns  based  on  the  characteristics  of  the  input  can  be  made. 
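The zero-input observation can be checked with a software model of the conversion. The following C sketch is an illustration only: it uses the standard butterfly form of the binary Moebius transform, which converts between ANF coefficients and truth table and uses n·2^(n-1) XOR operations, the same count as R(n). Treating it as equivalent to the reduced transeunt triangle is an assumption, and the function name is hypothetical. The bit vector is packed into a 64-bit word, so n is limited to 6.

```c
#include <assert.h>
#include <stdint.h>

/* Binary Moebius transform of a length-2^n bit vector (n <= 6).
   Bit i of v holds input d_i. Applying it twice returns the input. */
uint64_t anf_transform(uint64_t v, unsigned n) {
    for (unsigned s = 0; s < n; s++) {
        unsigned half = 1u << s;
        for (unsigned i = 0; i < (1u << n); i++)
            if (i & half)  /* upper element of a butterfly pair */
                v ^= ((v >> (i - half)) & 1) << i;
    }
    return v;
}
```

For the 8-bit input 0x0B, whose bits 4 through 7 are zero, anf_transform(0x0B, 3) returns 0xDD, which is the n=2 result 0xD repeated twice. As a further check, the ANF with coefficients at indices 3, 12 and 48 (x0x1 XOR x2x3 XOR x4x5) transforms to the truth table 0x8777788878887888.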

The overall recommendation is to 1) reduce the size of the circuitry required to run the program, i.e., the nonlinearity design or the transeunt triangle, 2) discover trends in specific properties and test smaller groups of functions, 3) further pipeline the circuit so that longer Boolean functions can be tested, and 4) expand the capabilities of the SRC-6 reconfigurable computer.
APPENDIX A. SRC-6 CODE


The following includes code used to determine properties of functions including nonlinearity, degree, homogeneity, rotation symmetry, and dihedral symmetry. There are six major files required to run code on the SRC-6. They are makefile, main.c, subr.mc, mymacro.v, info, and block.v. The last three files are only required if a user macro is being implemented. The makefile guides the compiler to the location of the files it needs. It is standard in most cases so only one sample is included here. Some of the user defined macros are included without supporting files, when emphasis is on the macro. For code that is parameterized, only one instance is provided.

A.1 NONLINEARITY COMPUTATION FOR N=6, DEGREE=2
1.  Main.c 


/*********************************************************************/
/*                                                                   */
/* main.c - C program to run an SRC-6E implementation of nonlin.v    */
/*                                                                   */
/* Author: Jennifer Shafer                                           */
/* Created: April 3, 2009                                            */
/* Last modified: August 10, 2009                                    */
/*                                                                   */
/* Description: This file calls the subroutine then returns the      */
/* number of functions on 6 variables of degree 2 with each          */
/* nonlinearity.                                                     */
/*********************************************************************/

#include <map.h>
#include <stdio.h>
#include <stdlib.h>

#define NUMBER 28 //Highest nonlinearity for n=6

//Initialization of subroutine
void subr (int64_t*, int64_t*, int);

// Main establishes arrays and sizes and calls the subroutine.
// Using the output of the subroutine, the function displays the
// data for the user.
int main (int argc, char *argv[]) {

    // Initialize variables
    int mapnum = 0;
    int i;
    int64_t time_clk;
    int64_t *b;

    // Allocate array output values
    b = (int64_t *) malloc (NUMBER * sizeof (int64_t));

    // Allocate the map
    map_allocate (mapnum);

    // Call subroutine subr.mc on the MAP.
    subr (b, &time_clk, mapnum);

    // Print out the number of clocks.
    printf ("%lld clocks\n", time_clk);

    // Display output for the user.
    for (i=0; i<NUMBER; i++) {
        printf ("Number of 6-variable functions of degree 2 and nonlinearity %d: %lld\n", i, b[i]);
    }

    // Release the map resources
    map_free (1);
    exit (0);

}//end of int main (int argc, char *argv[])


2.  Subr.mc 


/*********************************************************************/
/*                                                                   */
/* subr.mc - MAP subroutine to produce the nonlinearity of degree 2  */
/* 6-variable functions.                                             */
/*                                                                   */
/* Author: Jennifer Shafer                                           */
/* Created: April 3, 2009                                            */
/* Last modified: August 10, 2009                                    */
/*                                                                   */
/* Description: This program calls the macro nl6n and creates        */
/* a histogram of nonlinearity values to send back to main.c.        */
/*                                                                   */
/*********************************************************************/

#include <libmap.h>

#define NUMBER 28

// Subroutine runs on the map
void subr (int64_t b[], int64_t *time, int mapnum) {

    // Declare OBM banks in SRC-6
    OBM_BANK_B (B, int64_t, NUMBER)

    // Declare variables
    int64_t t0, t1;
    uint64_t i0, i1;
    uint64_t o0;
    int i, sel;
    uint64_t j, m;
    int k=0;
    int64_t H0[NUMBER], H1[NUMBER], H2[NUMBER], H3[NUMBER];

    read_timer (&t0);

    // The nested for loop creates two counters sent into the mapper section of
    // nl6n.v to form a function of degree 2.
    for (m=1; m<32768; m++) {
        #pragma loop noloop_dep
        for (j=0; j<128; j++) {
            i0=m;
            i1=j;

            my_nl6n (i0, i1, &o0);

            // The nonlinearity computed becomes the index of a histogram. The output
            // alternates between four arrays to ensure no extra clock cycles are needed
            // to read and write data.
            sel = k&3;
            if (sel==0)
                H0[o0]++;
            if (sel==1)
                H1[o0]++;
            if (sel==2)
                H2[o0]++;
            if (sel==3)
                H3[o0]++;
        } // end inner for loop
        k++;
    } // end outer for loop

    // The four arrays are added together to form the final output
    for (i=0; i<NUMBER; i++)
        B[i]=H0[i]+H1[i]+H2[i]+H3[i];

    read_timer (&t1);

    *time = (t1 - t0);

    // Return functions by DMAing to the CPU
    DMA_CPU (OBM2CM, B, MAP_OBM_stripe (1,"B"), b, 1, NUMBER*sizeof(int64_t), 0);
    wait_DMA (0);

} // End subr.mc

3.  Makefile 

#  $Id: Makefile.template,v 1.13 2005/04/12 19:18:30 jls Exp $

# 

#  Copyright  2003  SRC  Computers,  Inc.  All  Rights  Reserved. 

# 

#  Manufactured  in  the  United  States  of  America. 

# 

#  SRC  Computers,  Inc. 

#  4240  N  Nevada  Avenue 

#  Colorado  Springs,  CO  80907 

#  (v)  (719)  262-0213 

#  (f)  (719)  262-0223 

# 

#  No  permission  has  been  granted  to  distribute  this  software 

#  without  the  express  permission  of  SRC  Computers,  Inc. 

# 

#  This  program  is  distributed  WITHOUT  ANY  WARRANTY  OF  ANY  KIND. 

#  - 

#  User  defines  FILES,  MAPFILES,  and  BIN  here 

#  - 

FILES  =  main.c 

MAPFILES  =  subr.mc 

BIN  =  main 

# - 

#  Multi  chip  info  provided  here 

#  (Leave  commented  out  if  not  used) 

#  - 

#PRIMARY  =  <primary  file  1>  <primary  file  2> 

#SECONDARY  =  <secondary  file  1>  <secondary  file  2> 
#CHIP2  =  <file  to  compile  to  user  chip  2>

# - 

#  User  defined  directory  of  code  routines 

#  that  are  to  be  inlined 

#  - 

#INLINEDIR 

# - 

#  User  defined  macros  info  supplied  here 

# 

#  (Leave  commented  out  if  not  used) 

#  - 

MACROS  =  my_macro/nl6n.v

MY_BLKBOX  =  my_macro/blk.v

MY_NGO_DIR  =  my_macro 

MY_INFO  =  my_macro/info 

# - - - 

#  Floating  point  macros  selection 

#  - 

#FPMODE  =  SRC_IEEE_V1  #  Default  SRC  version  IEEE 

#FPMODE  =  SRC_IEEE_V2  #  Size  reduced  SRC  IEEE  with 

#  special  rounding  mode 

# - 

#  User  supplied  MCC  and  MFTN  flags 

#  - 

MCCFLAGS  =  -V  -keep 

MFTNFLAGS  =  -v 

# - 

#  User  supplied  flags  for  C  &  Fortran  compilers 

#  - 

CC  =  icc  #  icc  for  Intel  cc  for  Gnu 

FC  =  ifort  #  ifort  for  Intel  f77  for  Gnu 

#LD  =  ifort  -nofor_main  #  for  mixed  C  and  Fortran,  main  in  C 

#LD  =  ifort  #  for  Fortran  or  C/Fortran  mixed,  main  in  Fortran 

LD  =  icc  #  for  C  codes 

MY_CFLAGS  =

MY_FFLAGS  =

MY_LDFLAGS  =  #  Flags  to  include  libs  if  needed 

#  - - 

#  VCS  simulation  settings 

#  (Set  as  needed,  otherwise  just  leave  commented  out) 

#  - 

#USEVCS  =  yes  #  YES  or  yes  to  use  vcs  instead  of  vcsi 

#VCSDUMP  =  yes  #  YES  or  yes  to  generate  vcd+  trace  dump 

# - 

#  MODELSIM  simulation  settings 

#  (Set  as  needed,  otherwise  just  leave  commented  out) 

#  - 

#USEMDL  =  yes  #  YES  or  yes  to  use  modelsim  instead  of  vcs/vcsi 

#USEMDLGUI  =  yes  #  YES  or  yes  to  use  modelsim  GUI  interface 

#MDLDUMP  =  yes  #  YES  or  yes  to  generate  vcd  trace  dump 

# - 

#  No  modifications  are  required  below 

#  - 

MAKIN  ?=  $(MC_ROOT)/opt/srcci/comp/lib/AppRules.make
include  $(MAKIN)
4. Blk.v


/******************************************************/
/* blk.v - black-box file that specifies input and output */
/*                                                    */
/* Author: Jennifer Shafer                            */
/* Created: May 15, 2009                              */
/* Last modified: August 10, 2009                     */
/*                                                    */
/******************************************************/
module nl6n(val0, val1, CLK, fit0);

// Initialize input and output variables for the compiler
input CLK /* synthesis syn_noclockbuf=1 syn_maxfan=100000 */ ;
input [14:0] val0;
input [6:0] val1;
output [6:0] fit0;

endmodule


5.  Info 


//*****************************************************************
//*
//* info - info file to specify the input and output of the macros
//*
//* Author: Jennifer Shafer
//* Created: May 10, 2009
//* Last modified: August 10, 2009
//*
//*****************************************************************

BEGIN_DEF "my_nl6n"
    MACRO = "nl6n";
    STATEFUL = NO;
    EXTERNAL = NO;
    PIPELINED = YES;
    LATENCY = 8;

    INPUTS = 2:
        I0 = INT 16 BITS (val0[14:0])
        I1 = INT 16 BITS (val1[6:0])

    OUTPUTS = 1:
        O0 = INT 32 BITS (fit0[6:0])
    IN_SIGNAL : 1 BITS "CLK" = "CLOCK";

END_DEF


6. nl6n.v (After: Ref [16])


module min4 (a, b, c, d, CLK, z);
input [6:0] a, b, c, d;
input CLK;
output [6:0] z;
reg [6:0] z;
reg [6:0] alpha, beta;

always @ (a, b, c, d)
begin
    alpha <= (a<b)?a:b;
    beta <= (c<d)?c:d;
end

always @ (posedge CLK)
begin
    z <= (alpha<beta)?alpha:beta;
end

endmodule

module OC (TT, Count);
input [3:0] TT;      //Only 4 of 8 bits are used.
output [2:0] Count;  //Only 3 of 8 bits are used.
wire [2:0] Count;

assign Count[0]=TT[3]^TT[2]^TT[1]^TT[0];
assign Count[1]=(TT[3]&TT[2] | TT[3]&TT[1] | TT[3]&TT[0] |
                 TT[2]&TT[1] | TT[2]&TT[0] | TT[1]&TT[0]) & ~(TT[3]&TT[2]&TT[1]&TT[0]);
assign Count[2]=TT[3]&TT[2]&TT[1]&TT[0];
endmodule

module count64(TT, CLK, count);
input [63:0] TT;
input CLK;
output [6:0] count;
reg [6:0] count;
reg [6:0] cnt;
reg [4:0] counta, countb, countc, countd;

wire [2:0] count0, count1, count2, count3, count4, count5,
    count6, count7, count8, count9, count10, count11, count12, count13,
    count14, count15;

OC o0(TT[3:0], count0);
OC o1(TT[7:4], count1);
OC o2(TT[11:8], count2);
OC o3(TT[15:12], count3);
OC o4(TT[19:16], count4);
OC o5(TT[23:20], count5);
OC o6(TT[27:24], count6);
OC o7(TT[31:28], count7);
OC o8(TT[35:32], count8);
OC o9(TT[39:36], count9);
OC o10(TT[43:40], count10);
OC o11(TT[47:44], count11);
OC o12(TT[51:48], count12);
OC o13(TT[55:52], count13);
OC o14(TT[59:56], count14);
OC o15(TT[63:60], count15);

always @(posedge CLK)
begin
    counta <= count0+count1+count2+count3;
    countb <= count4+count5+count6+count7;
    countc <= count8+count9+count10+count11;
    countd <= count12+count13+count14+count15;
    cnt <= counta+countb+countc+countd;
    if(cnt<=32) count=cnt;
    else count=64-cnt;
end

endmodule

module fit6n(TT, CLK, fit);
input [63:0] TT;
input CLK /* synthesis syn_noclockbuf=1 syn_maxfan=100000 */ ;
output [6:0] fit;
wire [6:0] fit;

wire [63:0] afns [127:0];

reg [63:0] res0, res1, res2, res3, res4, res5, res6, res7, res8,
    res9, res10, res11, res12, res13, res14, res15, res16, res17, res18,
    res19, res20, res21, res22, res23, res24, res25, res26, res27, res28,
    res29, res30, res31;

reg [63:0] res32, res33, res34, res35, res36, res37, res38,
    res39, res40, res41, res42, res43, res44, res45, res46, res47, res48,
    res49, res50, res51, res52, res53, res54, res55, res56, res57, res58,
    res59, res60, res61, res62, res63;

wire [6:0] counts0, counts1, counts2, counts3, counts4, counts5,
    counts6, counts7, counts8, counts9, counts10, counts11, counts12,
    counts13, counts14, counts15, counts16, counts17, counts18, counts19,
    counts20, counts21, counts22, counts23, counts24, counts25, counts26,
    counts27, counts28, counts29, counts30, counts31;

wire [6:0] counts32, counts33, counts34, counts35, counts36,
    counts37, counts38, counts39, counts40, counts41, counts42, counts43,
    counts44, counts45, counts46, counts47, counts48, counts49, counts50,
    counts51, counts52, counts53, counts54, counts55, counts56, counts57,
    counts58, counts59, counts60, counts61, counts62, counts63;

wire [6:0] min_1_0, min_1_1, min_1_2, min_1_3, min_1_4, min_1_5,
    min_1_6, min_1_7, min_1_8, min_1_9, min_1_10, min_1_11, min_1_12,
    min_1_13, min_1_14, min_1_15;

wire [6:0] min_2_0, min_2_1, min_2_2, min_2_3;

assign afns[0]=64'h0;
assign afns[1]=64'haaaaaaaaaaaaaaaa;
assign afns[2]=64'hcccccccccccccccc;
assign afns[3]=64'h6666666666666666;
assign afns[4]=64'hf0f0f0f0f0f0f0f0;
assign afns[5]=64'h5a5a5a5a5a5a5a5a;
assign afns[6]=64'h3c3c3c3c3c3c3c3c;
assign afns[7]=64'h9696969696969696;
assign afns[8]=64'hff00ff00ff00ff00;
assign afns[9]=64'h55aa55aa55aa55aa;
assign afns[10]=64'h33cc33cc33cc33cc;
assign afns[11]=64'h9966996699669966;
assign afns[12]=64'h0ff00ff00ff00ff0;
assign afns[13]=64'ha55aa55aa55aa55a;
assign afns[14]=64'hc33cc33cc33cc33c;
assign afns[15]=64'h6996699669966996;
assign afns[16]=64'hffff0000ffff0000;
assign afns[17]=64'h5555aaaa5555aaaa;
assign afns[18]=64'h3333cccc3333cccc;
assign afns[19]=64'h9999666699996666;
assign afns[20]=64'h0f0ff0f00f0ff0f0;
assign afns[21]=64'ha5a55a5aa5a55a5a;
assign afns[22]=64'hc3c33c3cc3c33c3c;
assign afns[23]=64'h6969969669699696;
assign afns[24]=64'h00ffff0000ffff00;
assign afns[25]=64'haa5555aaaa5555aa;
assign afns[26]=64'hcc3333cccc3333cc;
assign afns[27]=64'h6699996666999966;
assign afns[28]=64'hf00f0ff0f00f0ff0;
assign afns[29]=64'h5aa5a55a5aa5a55a;
assign afns[30]=64'h3cc3c33c3cc3c33c;
assign afns[31]=64'h9669699696696996;
assign afns[32]=64'hffffffff00000000;
assign afns[33]=64'h55555555aaaaaaaa;
assign afns[34]=64'h33333333cccccccc;
assign afns[35]=64'h9999999966666666;
assign afns[36]=64'h0f0f0f0ff0f0f0f0;
assign afns[37]=64'ha5a5a5a55a5a5a5a;
assign afns[38]=64'hc3c3c3c33c3c3c3c;
assign afns[39]=64'h6969696996969696;
assign afns[40]=64'h00ff00ffff00ff00;
assign afns[41]=64'haa55aa5555aa55aa;
assign afns[42]=64'hcc33cc3333cc33cc;
assign afns[43]=64'h6699669999669966;
assign afns[44]=64'hf00ff00f0ff00ff0;
assign afns[45]=64'h5aa55aa5a55aa55a;
assign afns[46]=64'h3cc33cc3c33cc33c;
assign afns[47]=64'h9669966969966996;
assign afns[48]=64'h0000ffffffff0000;
assign afns[49]=64'haaaa55555555aaaa;
assign afns[50]=64'hcccc33333333cccc;
assign afns[51]=64'h6666999999996666;
assign afns[52]=64'hf0f00f0f0f0ff0f0;
assign afns[53]=64'h5a5aa5a5a5a55a5a;
assign afns[54]=64'h3c3cc3c3c3c33c3c;
assign afns[55]=64'h9696696969699696;
assign afns[56]=64'hff0000ff00ffff00;
assign afns[57]=64'h55aaaa55aa5555aa;
assign afns[58]=64'h33cccc33cc3333cc;
assign afns[59]=64'h9966669966999966;
assign afns[60]=64'h0ff0f00ff00f0ff0;
assign afns[61]=64'ha55a5aa55aa5a55a;
assign afns[62]=64'hc33c3cc33cc3c33c;
assign afns[63]=64'h6996966996696996;

count64 c0(res0, CLK, counts0);
count64 c1(res1, CLK, counts1);
count64 c2(res2, CLK, counts2);
count64 c3(res3, CLK, counts3);
count64 c4(res4, CLK, counts4);
count64 c5(res5, CLK, counts5);
count64 c6(res6, CLK, counts6);
count64 c7(res7, CLK, counts7);
count64 c8(res8, CLK, counts8);
count64 c9(res9, CLK, counts9);
count64 c10(res10, CLK, counts10);
count64 c11(res11, CLK, counts11);
count64 c12(res12, CLK, counts12);
count64 c13(res13, CLK, counts13);
count64 c14(res14, CLK, counts14);
count64 c15(res15, CLK, counts15);
count64 c16(res16, CLK, counts16);
count64 c17(res17, CLK, counts17);
count64 c18(res18, CLK, counts18);
count64 c19(res19, CLK, counts19);
count64 c20(res20, CLK, counts20);
count64 c21(res21, CLK, counts21);
count64 c22(res22, CLK, counts22);
count64 c23(res23, CLK, counts23);
count64 c24(res24, CLK, counts24);
count64 c25(res25, CLK, counts25);
count64 c26(res26, CLK, counts26);
count64 c27(res27, CLK, counts27);
count64 c28(res28, CLK, counts28);
count64 c29(res29, CLK, counts29);
count64 c30(res30, CLK, counts30);
count64 c31(res31, CLK, counts31);
count64 c32(res32, CLK, counts32);
count64 c33(res33, CLK, counts33);
count64 c34(res34, CLK, counts34);
count64 c35(res35, CLK, counts35);
count64 c36(res36, CLK, counts36);
count64 c37(res37, CLK, counts37);
count64 c38(res38, CLK, counts38);
count64 c39(res39, CLK, counts39);
count64 c40(res40, CLK, counts40);
count64 c41(res41, CLK, counts41);
count64 c42(res42, CLK, counts42);
count64 c43(res43, CLK, counts43);
count64 c44(res44, CLK, counts44);
count64 c45(res45, CLK, counts45);
count64 c46(res46, CLK, counts46);
count64 c47(res47, CLK, counts47);
count64 c48(res48, CLK, counts48);
count64 c49(res49, CLK, counts49);
count64 c50(res50, CLK, counts50);
count64 c51(res51, CLK, counts51);
count64 c52(res52, CLK, counts52);
count64 c53(res53, CLK, counts53);
count64 c54(res54, CLK, counts54);
count64 c55(res55, CLK, counts55);
count64 c56(res56, CLK, counts56);
count64 c57(res57, CLK, counts57);
count64 c58(res58, CLK, counts58);
count64 c59(res59, CLK, counts59);
count64 c60(res60, CLK, counts60);
count64 c61(res61, CLK, counts61);
count64 c62(res62, CLK, counts62);
count64 c63(res63, CLK, counts63);

min4 m1_0(counts0, counts1, counts2, counts3, CLK, min_1_0);
min4 m1_1(counts4, counts5, counts6, counts7, CLK, min_1_1);
min4 m1_2(counts8, counts9, counts10, counts11, CLK, min_1_2);
min4 m1_3(counts12, counts13, counts14, counts15, CLK, min_1_3);
min4 m1_4(counts16, counts17, counts18, counts19, CLK, min_1_4);
min4 m1_5(counts20, counts21, counts22, counts23, CLK, min_1_5);
min4 m1_6(counts24, counts25, counts26, counts27, CLK, min_1_6);
min4 m1_7(counts28, counts29, counts30, counts31, CLK, min_1_7);
min4 m1_8(counts32, counts33, counts34, counts35, CLK, min_1_8);
min4 m1_9(counts36, counts37, counts38, counts39, CLK, min_1_9);
min4 m1_10(counts40, counts41, counts42, counts43, CLK, min_1_10);
min4 m1_11(counts44, counts45, counts46, counts47, CLK, min_1_11);
min4 m1_12(counts48, counts49, counts50, counts51, CLK, min_1_12);
min4 m1_13(counts52, counts53, counts54, counts55, CLK, min_1_13);
min4 m1_14(counts56, counts57, counts58, counts59, CLK, min_1_14);
min4 m1_15(counts60, counts61, counts62, counts63, CLK, min_1_15);

min4 m2_0(min_1_0, min_1_1, min_1_2, min_1_3, CLK, min_2_0);
min4 m2_1(min_1_4, min_1_5, min_1_6, min_1_7, CLK, min_2_1);
min4 m2_2(min_1_8, min_1_9, min_1_10, min_1_11, CLK, min_2_2);
min4 m2_3(min_1_12, min_1_13, min_1_14, min_1_15, CLK, min_2_3);

min4 m3_0(min_2_0, min_2_1, min_2_2, min_2_3, CLK, fit);

always @(posedge CLK)
begin
    res0 <= TT ^ afns[0];
    res1 <= TT ^ afns[1];
    res2 <= TT ^ afns[2];
    res3 <= TT ^ afns[3];
    res4 <= TT ^ afns[4];
    res5 <= TT ^ afns[5];
    res6 <= TT ^ afns[6];
    res7 <= TT ^ afns[7];
    res8 <= TT ^ afns[8];
    res9 <= TT ^ afns[9];
    res10 <= TT ^ afns[10];
    res11 <= TT ^ afns[11];
    res12 <= TT ^ afns[12];
    res13 <= TT ^ afns[13];
    res14 <= TT ^ afns[14];
    res15 <= TT ^ afns[15];
    res16 <= TT ^ afns[16];
    res17 <= TT ^ afns[17];
    res18 <= TT ^ afns[18];
    res19 <= TT ^ afns[19];
    res20 <= TT ^ afns[20];
    res21 <= TT ^ afns[21];
    res22 <= TT ^ afns[22];
    res23 <= TT ^ afns[23];
    res24 <= TT ^ afns[24];
    res25 <= TT ^ afns[25];
    res26 <= TT ^ afns[26];
    res27 <= TT ^ afns[27];
    res28 <= TT ^ afns[28];
    res29 <= TT ^ afns[29];
    res30 <= TT ^ afns[30];
    res31 <= TT ^ afns[31];
    res32 <= TT ^ afns[32];
    res33 <= TT ^ afns[33];
    res34 <= TT ^ afns[34];
    res35 <= TT ^ afns[35];
    res36 <= TT ^ afns[36];
    res37 <= TT ^ afns[37];
    res38 <= TT ^ afns[38];
    res39 <= TT ^ afns[39];
    res40 <= TT ^ afns[40];
    res41 <= TT ^ afns[41];
    res42 <= TT ^ afns[42];
    res43 <= TT ^ afns[43];
    res44 <= TT ^ afns[44];
    res45 <= TT ^ afns[45];
    res46 <= TT ^ afns[46];
    res47 <= TT ^ afns[47];
    res48 <= TT ^ afns[48];
    res49 <= TT ^ afns[49];
    res50 <= TT ^ afns[50];
    res51 <= TT ^ afns[51];
    res52 <= TT ^ afns[52];
    res53 <= TT ^ afns[53];
    res54 <= TT ^ afns[54];
    res55 <= TT ^ afns[55];
    res56 <= TT ^ afns[56];
    res57 <= TT ^ afns[57];
    res58 <= TT ^ afns[58];
    res59 <= TT ^ afns[59];
    res60 <= TT ^ afns[60];
    res61 <= TT ^ afns[61];
    res62 <= TT ^ afns[62];
    res63 <= TT ^ afns[63];
end

endmodule

module mapn6d2(COUNTER, COUNTER1, ANF, CLK); //all

input [14:0] COUNTER;  //counter should be 1 through (2^15)-1 to ensure
                       //at least one term of deg 2 is included
input [6:0] COUNTER1;  //counter should be 0 through (2^7)-1
input CLK;
output [63:0] ANF;
wire [14:0] COUNTER;
reg [63:0] ANF;

always @(posedge CLK)
begin
    ANF[63] <= 1'b0;
    ANF[62] <= 1'b0;
    ANF[61] <= 1'b0;
    ANF[60] <= 1'b0;
    ANF[59] <= 1'b0;
    ANF[58] <= 1'b0;
    ANF[57] <= 1'b0;
    ANF[56] <= 1'b0;
    ANF[55] <= 1'b0;
    ANF[54] <= 1'b0;
    ANF[53] <= 1'b0;
    ANF[52] <= 1'b0;
    ANF[51] <= 1'b0;
    ANF[50] <= 1'b0;
    ANF[49] <= 1'b0;
    ANF[48] <= COUNTER[14];  //2
    ANF[47] <= 1'b0;
    ANF[46] <= 1'b0;
    ANF[45] <= 1'b0;
    ANF[44] <= 1'b0;
    ANF[43] <= 1'b0;
    ANF[42] <= 1'b0;
    ANF[41] <= 1'b0;
    ANF[40] <= COUNTER[13];  //2
    ANF[39] <= 1'b0;
    ANF[38] <= 1'b0;
    ANF[37] <= 1'b0;
    ANF[36] <= COUNTER[12];  //2
    ANF[35] <= 1'b0;
    ANF[34] <= COUNTER[11];  //2
    ANF[33] <= COUNTER[10];  //2
    ANF[32] <= COUNTER1[6];  //1
    ANF[31] <= 1'b0;
    ANF[30] <= 1'b0;
    ANF[29] <= 1'b0;
    ANF[28] <= 1'b0;
    ANF[27] <= 1'b0;
    ANF[26] <= 1'b0;
    ANF[25] <= 1'b0;
    ANF[24] <= COUNTER[9];   //2
    ANF[23] <= 1'b0;
    ANF[22] <= 1'b0;
    ANF[21] <= 1'b0;
    ANF[20] <= COUNTER[8];   //2
    ANF[19] <= 1'b0;
    ANF[18] <= COUNTER[7];   //2
    ANF[17] <= COUNTER[6];   //2
    ANF[16] <= COUNTER1[5];  //1
    ANF[15] <= 1'b0;
    ANF[14] <= 1'b0;
    ANF[13] <= 1'b0;
    ANF[12] <= COUNTER[5];   //2
    ANF[11] <= 1'b0;
    ANF[10] <= COUNTER[4];   //2
    ANF[9] <= COUNTER[3];    //2
    ANF[8] <= COUNTER1[4];   //1
    ANF[7] <= 1'b0;
    ANF[6] <= COUNTER[2];    //2
    ANF[5] <= COUNTER[1];    //2
    ANF[4] <= COUNTER1[3];   //1
    ANF[3] <= COUNTER[0];    //2
    ANF[2] <= COUNTER1[2];   //1
    ANF[1] <= COUNTER1[1];   //1
    ANF[0] <= COUNTER1[0];   //0
end

endmodule


module trans_tri(IN, OUT, CLK);
parameter n = 6;                  // Number of variables.
localparam N = 2**n;              // Number of inputs and outputs.
output [N-1:0] OUT;               // OUT is the ANF of the input function.
input [N-1:0] IN;                 // IN is the specified truth table of
                                  // the input function.
reg [N-1:0] EXOR_array [N-1:0];   // The array in which the transeunt
                                  // triangle is embedded.
input CLK;
integer i, j;

always @(posedge CLK)
begin
    EXOR_array[0] = IN;  //Set left column of EXOR array to IN.
    for(i=1; i<N; i=i+1)  //Enumerate a level in the transeunt triangle.
    begin
        for(j=0; j<N; j=j+1)  //Enumerate a position in the current level.
        begin: level
            if (j <= i-1) EXOR_array[i][j] = EXOR_array[i-1][j];
            else EXOR_array[i][j] = EXOR_array[i-1][j] ^ EXOR_array[i-1][j-1];
        end
    end
end

assign OUT = EXOR_array[N-1];
endmodule

module nl6n(val0, val1, CLK, fit0);

input [14:0] val0;
input [6:0] val1;
input CLK /* synthesis syn_noclockbuf=1 syn_maxfan=100000 */;
output [6:0] fit0;
wire [6:0] fit0;
wire [63:0] val2;
wire [63:0] TT;

mapn6d2 A0(val0, val1, val2, CLK);
trans_tri B0(val2, TT, CLK);
fit6n_ f0(TT, CLK, fit0);

endmodule


A.2  CODE  TO  COMPUTE  NONLINEARITY  FOR  N=4  AND  N=5 


1.  nonlin.v 


This code can be substituted for nl6n.v in A.1.6, and the other files can be slightly
modified and used to run this program. The appropriate mapper must also be added to
generate functions of a specific degree. An example mapper is shown in Appendix A.5.
These functions must then be converted to truth tables using the trans_tri module shown
in Appendix A.3 or A.4.


//*******************************************************************
// nonlin.v - Compares a test function to all affine functions
//            and gives the nonlinearity as the output.
//
// Created: November 20, 2008
// Last Modified: February 9, 2009
// Author: Jennifer Shafer
//
// Description: nonlin receives a function as a truth table and
// EXORs it with every affine function. It generates the correct
// number of ones count modules, sends each EXOR result to one of
// them, and collects a long string of distances between the test
// function and each affine function. The string is sent to the min
// module to determine the smallest distance in the string. This
// distance is the nonlinearity of the test function.
//*******************************************************************

module nonlin(TT, NL, CLK);

// Define inputs, outputs, parameters, registers, and wires
parameter n=4;
parameter N=2**n;
parameter NN=2**(n+1);
input [N-1:0] TT;
input CLK;
output [7:0] NL;
reg [NN*N-1:0] EXOR;
reg [NN*N-1:0] EXOR_REG;
wire [(NN*(n+1))-1:0] MIN_IN;
reg [N-1:0] IN_REG;
reg [7:0] NL;
wire [7:0] NL_REG;
wire [N-1:0] TT;
reg [(NN*(n+1))-1:0] MIN_REG;
integer i,j;

//Created a loop to clock different registers so timing is correct
always @(posedge CLK)
begin
    IN_REG<=TT;
    MIN_REG<=MIN_IN;
    EXOR_REG<=EXOR;
    NL<=NL_REG;
end

// Enumerate affine functions and EXOR with test function
always @(*)
begin
    for (i=0; i<NN; i=i+1)
    begin
        for (j=0; j<N; j=j+1)
        begin
            EXOR[i*N+j] <= IN_REG[j] ^ (^(i&((j<<1)+1)));
        end
    end
end

// Generate the correct number of instantiations of Ones_Count
// Produce a long string of ones count values to be sent into MIN
generate
begin: CountOnes
    genvar p;
    for(p=0; p<NN; p=p+1)
    begin: In1
        Ones_Count I0 ( .TT(EXOR_REG[(p*N+(N-1))-:N]),
                        .Count(MIN_IN[(((n+1)*p)+n)-:(n+1)]) );
    end
end
endgenerate

// Call min to find the minimum of the distances
// NL_REG is the minimum value of the ones count values and the
// nonlinearity of the input function
min A0 ( .IN(MIN_REG), .OUT(NL_REG), .CLK(CLK) );

endmodule

module min(IN, OUT, CLK);

//*******************************************************************//
// min.v - Compares several n+1-bit binary values and delivers the   //
//         smallest one to the output.                               //
//                                                                   //
// Created: October 7, 2007                                          //
// Last Modified: November 20, 2008                                  //
// Author: Jon T. Butler                                             //
// Modified by: Jennifer Shafer                                      //
// Inputs: IN - string of all values to compare                      //
// Outputs: OUT - the minimum value                                  //
//*******************************************************************//

parameter inputs=4;         // Indicates number of variables in the function
parameter n = inputs+1;     // Indicates number of bits in the input to comparator
parameter affine = 2**n;    // Indicates number of affine functions created
parameter length = n*affine; // Indicates length of input vector
input [length-1:0] IN;      // Input is length of all inputs strung together
reg [length-1:0] curr_IN [affine:0];  // Register in which to build a
                                      // 'tree of comparators'
output [7:0] OUT;
input CLK;

integer i,j;  //for for loops
always @(posedge CLK)
begin
    curr_IN[0] <= IN;  //Take in the whole input as the first level of the tree
    for(j=1; j<=n; j=j+1)  // Enumerate a level in the comparison tree.
    begin
        for(i=0; i<2**(inputs+1-j); i=i+1)  //Enumerate a position in the current level.
        begin: increment  //Compare two values and store the min in the next higher level.
            if(curr_IN[j-1][((2*i+2)*n-1)-:n] < curr_IN[j-1][((2*i+1)*n-1)-:n])
                curr_IN[j][((i+1)*n-1)-:n] <= curr_IN[j-1][((2*i+2)*n-1)-:n];
            else curr_IN[j][((i+1)*n-1)-:n] <= curr_IN[j-1][((2*i+1)*n-1)-:n];
        end  //end inner for loop
    end  //end outer for loop
end  //end always statement

assign OUT = curr_IN[n][(n-1)-:n];  //Out is the final value in the tree
endmodule


module Ones_Count(TT, Count);

// Ones_Count.v - A program to count the 1s in an input            //
//                                                                 //
// Created: August 18, 2007                                        //
// Last Modified: October 27, 2008                                 //
// Author: Jon T. Butler                                           //
// Modified by: Jennifer Shafer                                    //
// Inputs: TT     n-variable Truth Table  2^n bits                 //
// Outputs: Count Number of 1s - n+1 bits                          //
//                                                                 //

parameter n=4;
parameter B=2**n;
input [B-1:0] TT;
output [n:0] Count;
reg [n:0] Count;

always @(TT)
begin: CHECK_n
    case(n)  // case statement for n=2 through n=6
        2: Count = Count2(TT);
        3: Count = Count2(TT[7:4]) + Count2(TT[3:0]);
        4: Count = Count2(TT[15:12]) + Count2(TT[11:8]) +
                   Count2(TT[7:4]) + Count2(TT[3:0]);
        5: Count = Count2(TT[31:28]) + Count2(TT[27:24]) +
                   Count2(TT[23:20]) + Count2(TT[19:16]) + Count2(TT[15:12])
                   + Count2(TT[11:8]) + Count2(TT[7:4]) + Count2(TT[3:0]);
        default: Count = Count2(TT);
    endcase
end

// - The 1s count function - Count2 for 2-variable functions - //

function [2:0] Count2;
input [3:0] AA;
begin: f2
    Count2[0] = AA[3]^AA[2]^AA[1]^AA[0];
    Count2[1] = (AA[3]&AA[2] | AA[3]&AA[1] | AA[3]&AA[0] | AA[2]&AA[1] | AA[2]&AA[0] |
                 AA[1]&AA[0]) ^ (AA[3]&AA[2]&AA[1]&AA[0]);
    Count2[2] = AA[3]&AA[2]&AA[1]&AA[0];
end
endfunction

endmodule
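
As a cross-check on the listing above, the same computation can be modeled in software. The following is a minimal sketch, not part of the original design: the function names are hypothetical, and it assumes the truth table is packed into an integer with bit j holding the function value at input j, as in the Verilog indexing. Bit 0 of the affine-function index is taken as the constant term, which mirrors the expression `^(i&((j<<1)+1))` in nonlin.

```python
def affine_truth_table(i, n):
    """Truth table (as an int, bit j = value at input j) of affine function i.

    Bit 0 of i is the constant term and bits 1..n select the linear terms,
    mirroring the reduction-XOR expression ^(i & ((j << 1) + 1)).
    """
    tt = 0
    for j in range(1 << n):
        parity = bin(i & ((j << 1) | 1)).count("1") & 1
        tt |= parity << j
    return tt


def nonlinearity(tt, n):
    """Minimum Hamming distance from tt to all 2**(n+1) affine functions."""
    return min(
        bin(tt ^ affine_truth_table(i, n)).count("1")  # ones count of the EXOR
        for i in range(1 << (n + 1))
    )
```

The inner `bin(...).count("1")` plays the role of Ones_Count and the outer `min` plays the role of the comparator tree; the hardware evaluates all affine functions in parallel, whereas this model loops over them.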


A.3  FULL  TRANSEUNT  TRIANGLE  VERILOG  CODE 


This code executes every Exclusive-Or operation in the triangle even though
some are redundant. This module will compile up to n=8.


//*******************************************************************//
// trans_tri.v - A program to implement the transeunt triangle of    //
//               an n-variable function.                             //
//                                                                   //
// Created: November 23, 2008                                        //
// Last Modified: January 5, 2009                                    //
// Author: Jon T. Butler                                             //
//                                                                   //
// Description: This module uses a 2-D array to form the triangle.   //
// The results of each EXOR operation are stored in the next         //
// higher row in the array. The top row of the array upon            //
// completion of all operations becomes the output of the module.    //
//*******************************************************************//

module trans_tri(IN, OUT, CLK);

parameter n = 6;                  // Number of variables.
localparam N = 2**n;              // Number of inputs and outputs.
output [N-1:0] OUT;               // OUT is the ANF of the input function.
input [N-1:0] IN;                 // IN is the specified truth table of the
                                  // input function.
reg [N-1:0] EXOR_array [N-1:0];   // The array in which the transeunt
                                  // triangle is embedded.
input CLK;
integer i, j;

always @(posedge CLK)
begin
    EXOR_array[0] = IN;  //Set left column of EXOR array to IN.
    for(i=1; i<N; i=i+1)  //Enumerate a level in the transeunt triangle.
    begin
        for(j=0; j<N; j=j+1)  //Enumerate a position in the current level.
        begin: level
            if (j <= i-1) EXOR_array[i][j] = EXOR_array[i-1][j];
            else EXOR_array[i][j] = EXOR_array[i-1][j] ^ EXOR_array[i-1][j-1];
        end
    end
end

assign OUT = EXOR_array[N-1];
endmodule
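
The triangle construction can be illustrated with a short software model. This is a sketch under stated assumptions rather than a translation of the module: the function name is hypothetical, the truth table is a Python list ordered f(0), f(1), ..., and the rows are shortened explicitly instead of being embedded in a square array. Each row is the EXOR of adjacent entries of the row above, and the leftmost entry of each row is one ANF coefficient.

```python
def transeunt_triangle(tt_bits):
    """Return the ANF coefficients of a truth table given as a list of 0/1.

    Row 0 is the truth table; every later row EXORs adjacent entries of
    the row above it. The left edge of the triangle is the ANF.
    """
    row = list(tt_bits)
    anf = []
    while row:
        anf.append(row[0])  # left edge entry is the next ANF coefficient
        row = [row[k] ^ row[k + 1] for k in range(len(row) - 1)]
    return anf
```

For example, the two-variable AND with truth table `[0, 0, 0, 1]` yields `[0, 0, 0, 1]`, a single degree-2 term. Applying the function twice returns the original list, so the same routine converts ANF back to a truth table.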

A.4  REDUCED  TRANSEUNT  TRIANGLE  VERILOG  CODE 

This code greatly reduces the number of operations. The code can work for n
higher than 6 with the addition of a new module for each additional n that calls the
previous module twice. It has been tested to work to at least n=9.
        OUT2<=EXOR_array[2][0];
        OUT1<=EXOR_array[1][0];
        OUTA<=EXOR_array[0][0];
        OUT0<=OUTA;
    end

assign OUT = {OUT3, OUT2, OUT1, OUT0};
endmodule

module n4tri(IN0, OUT0, CLK);

output [15:0] OUT0;
input [15:0] IN0;
input CLK;
wire [15:0] TT0;
reg [3:0] TT1;
reg [7:0] TT2;
reg [3:0] TT3;
reg [3:0] TTA;
wire [3:0] ANF0;
wire [3:0] ANF1;
wire [3:0] ANF2;
wire [3:0] ANF3;
integer i, k, j;

assign TT0=IN0;
always @(posedge CLK)
begin
    for(i=0; i<4; i=i+1)
    begin
        TT1[i]=TT0[i]^TT0[i+4];
    end
    for(j=0; j<8; j=j+1)
    begin
        TT2[j]=TT0[j]^TT0[j+8];
    end
    for(k=0; k<4; k=k+1)
    begin
        TT3[k]=TT2[k]^TT2[k+4];
    end
    TTA<=TT0[3:0];
end

trans_tri A0(.IN(TTA), .OUT(ANF0), .CLK(CLK));
trans_tri A1(.IN(TT1), .OUT(ANF1), .CLK(CLK));
trans_tri A2(.IN(TT2[3:0]), .OUT(ANF2), .CLK(CLK));
trans_tri A3(.IN(TT3), .OUT(ANF3), .CLK(CLK));

assign OUT0={ANF3, ANF2, ANF1, ANF0};

endmodule

module n5tri(IN0, OUT0, CLK);

output [31:0] OUT0;
input [31:0] IN0;
input CLK;
reg [15:0] TT0;
reg [15:0] TT1;
wire [15:0] ANF0;
wire [15:0] ANF1;
integer i;

n4tri B0(.IN0(TT0), .OUT0(ANF0), .CLK(CLK));
n4tri B1(.IN0(TT1), .OUT0(ANF1), .CLK(CLK));
always @(posedge CLK)
begin
    for(i=0; i<16; i=i+1)
    begin
        TT1[i]=IN0[i]^IN0[i+16];
    end
    TT0<=IN0[15:0];
end

assign OUT0={ANF1, ANF0};
endmodule

module n6tri(IN0, OUT0, CLK);

output [63:0] OUT0;
input [63:0] IN0;
input CLK;
reg [31:0] TT0;
reg [31:0] TT1;
wire [31:0] ANF0;
wire [31:0] ANF1;
reg [31:0] ANF2;
integer i;

n5tri B0(.IN0(TT0), .OUT0(ANF0), .CLK(CLK));
n5tri B1(.IN0(TT1), .OUT0(ANF1), .CLK(CLK));
always @(posedge CLK)
begin
    for(i=0; i<32; i=i+1)
    begin
        TT1[i]=IN0[i]^IN0[i+32];
    end
    TT0<=IN0[31:0];
end

assign OUT0={ANF1, ANF0};

endmodule
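
The halving pattern used by n5tri and n6tri can be sketched recursively in software. This is an illustrative model only: the function name is hypothetical, the truth table is a Python list of 0/1 whose length is a power of two, and the recursion is carried down to a single bit rather than stopping at a small base module as the Verilog does. The low half of the result is the ANF of the low half of the input, and the high half is the ANF of the EXOR of the two input halves.

```python
def anf_by_halving(tt_bits):
    """ANF of a truth table (list of 0/1, length a power of two)."""
    size = len(tt_bits)
    if size == 1:
        return list(tt_bits)
    half = size // 2
    low = tt_bits[:half]
    # EXOR of the low and high halves, as in TT1[i] = IN0[i] ^ IN0[i+half].
    mixed = [tt_bits[k] ^ tt_bits[k + half] for k in range(half)]
    # Concatenation order matches OUT0 = {ANF1, ANF0}: ANF0 fills the low indices.
    return anf_by_halving(low) + anf_by_halving(mixed)
```

Each level of the recursion needs only half as many EXOR operations as there are truth table entries, which is where the saving over the full triangle comes from.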


A.5  TWO  EXAMPLE  MAPPER  MODULES 


The first module uses a counter from 1 to 2^20 - 1 to enumerate all homogeneous
functions of degree 2 on 6 variables. This function will be in ANF and needs to be
converted to a truth table using the trans_tri module in order to be tested for nonlinearity.
The second module below uses two counters to enumerate all 6-variable functions of
highest degree 2. The first is the same counter used for homogeneous functions, and the
second counter enumerates all functions with terms lower than degree 2. The two
counters can be sent to the module using nested for loops. The call in the subroutine
would look like the following:

for (m=1; m<32768; m++) {
    for (j=0; j<128; j++)
    {
        i0=m;
        i1=j;
        mapn6d2(i0, i1, &o0);
    }
}

These modules can also be called from the nonlinearity module so the program
only uses one macro call. The module mapn6d2h creates all homogeneous functions of
degree 2 for n=6. The second module listed creates all functions of degree 2 for n=6.
These functions will include terms of degree zero and one.

module nl6n(counter1, counter2, CLK, NL);
input [19:0] counter1;
input [6:0] counter2;
input CLK /* synthesis syn_noclockbuf=1 syn_maxfan=100000 */;
output [6:0] NL;
wire [6:0] NL;
wire [63:0] ANF;
wire [63:0] TT;

mapn6d2 A0(counter1, counter2, ANF, CLK);
trans_tri B0(ANF, TT, CLK);
find_NL C0(TT, CLK, NL);

endmodule


module mapn6d2h(COUNTER, ANF, CLK);  //homogeneous

input [19:0] COUNTER;  //counter should be 1 through (2^20)-1 to ensure
                       //at least one term of deg 2 is included
input CLK;
output [63:0] ANF;
wire [19:0] COUNTER;
reg [63:0] ANF;

always @(posedge CLK)
begin
    ANF[63] <= 1'b0;
    ANF[62] <= 1'b0;
    ANF[61] <= 1'b0;
    ANF[60] <= 1'b0;
    ANF[59] <= 1'b0;
    ANF[58] <= 1'b0;
    ANF[57] <= 1'b0;
    ANF[56] <= 1'b0;
    ANF[55] <= 1'b0;
    ANF[54] <= 1'b0;
    ANF[53] <= 1'b0;
    ANF[52] <= 1'b0;
    ANF[51] <= 1'b0;
    ANF[50] <= 1'b0;
    ANF[49] <= 1'b0;
    ANF[48] <= COUNTER[14];  //2
    ANF[47] <= 1'b0;
    ANF[46] <= 1'b0;
    ANF[45] <= 1'b0;
    ANF[44] <= 1'b0;
    ANF[43] <= 1'b0;
    ANF[42] <= 1'b0;
    ANF[41] <= 1'b0;
    ANF[40] <= COUNTER[13];  //2
    ANF[39] <= 1'b0;
    ANF[38] <= 1'b0;
    ANF[37] <= 1'b0;
    ANF[36] <= COUNTER[12];  //2
    ANF[35] <= 1'b0;
    ANF[34] <= COUNTER[11];  //2
    ANF[33] <= COUNTER[10];  //2
    ANF[32] <= 1'b0;
    ANF[31] <= 1'b0;
    ANF[30] <= 1'b0;
    ANF[29] <= 1'b0;
    ANF[28] <= 1'b0;
    ANF[27] <= 1'b0;
    ANF[26] <= 1'b0;
    ANF[25] <= 1'b0;
    ANF[24] <= COUNTER[9];   //2
    ANF[23] <= 1'b0;
    ANF[22] <= 1'b0;
    ANF[21] <= 1'b0;
    ANF[20] <= COUNTER[8];   //2
    ANF[19] <= 1'b0;
    ANF[18] <= COUNTER[7];   //2
    ANF[17] <= COUNTER[6];   //2
    ANF[16] <= 1'b0;
    ANF[15] <= 1'b0;
    ANF[14] <= 1'b0;
    ANF[13] <= 1'b0;
    ANF[12] <= COUNTER[5];   //2
    ANF[11] <= 1'b0;
    ANF[10] <= COUNTER[4];   //2
    ANF[9] <= COUNTER[3];    //2
    ANF[8] <= 1'b0;
    ANF[7] <= 1'b0;
    ANF[6] <= COUNTER[2];    //2
    ANF[5] <= COUNTER[1];    //2
    ANF[4] <= 1'b0;
    ANF[3] <= COUNTER[0];    //2
    ANF[2] <= 1'b0;
    ANF[1] <= 1'b0;
    ANF[0] <= 1'b0;
end

endmodule

// [6554544354434332 5443433243323221 5443433243323221 4332322132212110]
// Above is the degree of each term listed by index (63 down to 0)

module mapn6d2(COUNTER, COUNTER1, ANF, CLK);  //all

input [19:0] COUNTER;  //counter should be 1 through (2^20)-1 to ensure
                       //at least one term of deg 2 is included
input [6:0] COUNTER1;  //counter should be 0 through (2^7)-1
input CLK;
output [63:0] ANF;
wire [19:0] COUNTER;
reg [63:0] ANF;

always @(posedge CLK)
begin
    ANF[63] <= 1'b0;
    ANF[62] <= 1'b0;
    ANF[61] <= 1'b0;
    ANF[60] <= 1'b0;
    ANF[59] <= 1'b0;
    ANF[58] <= 1'b0;
    ANF[57] <= 1'b0;
    ANF[56] <= 1'b0;
    ANF[55] <= 1'b0;
    ANF[54] <= 1'b0;
    ANF[53] <= 1'b0;
    ANF[52] <= 1'b0;
    ANF[51] <= 1'b0;
    ANF[50] <= 1'b0;
    ANF[49] <= 1'b0;
    ANF[48] <= COUNTER[14];  //2
    ANF[47] <= 1'b0;
    ANF[46] <= 1'b0;
    ANF[45] <= 1'b0;
    ANF[44] <= 1'b0;
    ANF[43] <= 1'b0;
    ANF[42] <= 1'b0;
    ANF[41] <= 1'b0;
    ANF[40] <= COUNTER[13];  //2
    ANF[39] <= 1'b0;
    ANF[38] <= 1'b0;
    ANF[37] <= 1'b0;
    ANF[36] <= COUNTER[12];  //2
    ANF[35] <= 1'b0;
    ANF[34] <= COUNTER[11];  //2
    ANF[33] <= COUNTER[10];  //2
    ANF[32] <= COUNTER1[6];  //1
    ANF[31] <= 1'b0;
    ANF[30] <= 1'b0;
    ANF[29] <= 1'b0;
    ANF[28] <= 1'b0;
    ANF[27] <= 1'b0;
    ANF[26] <= 1'b0;
    ANF[25] <= 1'b0;
    ANF[24] <= COUNTER[9];   //2
    ANF[23] <= 1'b0;
    ANF[22] <= 1'b0;
    ANF[21] <= 1'b0;
    ANF[20] <= COUNTER[8];   //2
    ANF[19] <= 1'b0;
    ANF[18] <= COUNTER[7];   //2
    ANF[17] <= COUNTER[6];   //2
    ANF[16] <= COUNTER1[5];  //1
    ANF[15] <= 1'b0;
    ANF[14] <= 1'b0;
    ANF[13] <= 1'b0;
    ANF[12] <= COUNTER[5];   //2
    ANF[11] <= 1'b0;
    ANF[10] <= COUNTER[4];   //2
    ANF[9] <= COUNTER[3];    //2
    ANF[8] <= COUNTER1[4];   //1
    ANF[7] <= 1'b0;
    ANF[6] <= COUNTER[2];    //2
    ANF[5] <= COUNTER[1];    //2
    ANF[4] <= COUNTER1[3];   //1
    ANF[3] <= COUNTER[0];    //2
    ANF[2] <= COUNTER1[2];   //1
    ANF[1] <= COUNTER1[1];   //1
    ANF[0] <= COUNTER1[0];   //0
end

endmodule
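
The bit assignments in the mappers follow a simple rule that can be written as a loop. The sketch below is a software model with hypothetical names, not the author's generator: it walks the ANF indices in increasing order, places successive bits of the first counter at indices whose binary weight is 2, and places successive bits of the second counter at indices of weight 0 or 1. For n=6 this reproduces the assignments in mapn6d2 (for example COUNTER[14] at ANF[48] and COUNTER1[6] at ANF[32]).

```python
def map_degree2(counter, counter1, n=6):
    """Scatter counter bits into degree-2 ANF positions and counter1 bits
    into the degree-0 and degree-1 positions; higher degrees stay zero."""
    anf = 0
    hi = 0  # next bit of counter to place
    lo = 0  # next bit of counter1 to place
    for idx in range(1 << n):
        degree = bin(idx).count("1")  # degree of the term with this index
        if degree == 2:
            anf |= ((counter >> hi) & 1) << idx
            hi += 1
        elif degree <= 1:
            anf |= ((counter1 >> lo) & 1) << idx
            lo += 1
    return anf
```

Passing `counter1 = 0` gives the homogeneous case handled by mapn6d2h.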

A.6  PARAMETERIZED  CODE  TO  GENERATE  FUNCTIONS  OF  DEGREE  D 
1.  Macro_1.v

Macro_1 creates two vectors the same length as the function. Each bit in the
vectors represents the degree of the corresponding term in the function. The first vector,
deg_vec1, contains a 1 if the corresponding term is degree d or less and a 0 otherwise. The
second vector, deg_vec2, contains a 1 if the corresponding term is degree d and a 0
otherwise. The output lengthbuf is 2 raised to the number of ones in deg_vec1. Macro_1 and
Macro_2 are alternatives to the mappers in Appendix A.5. The mappers must be created
specifically for each n and d. Macro_1 receives n and d as inputs. Macro_1 and
Macro_2, however, are too complex for n>5.

module macro_1(n, d, buf1, buf2, lengthbuf, CLK);

input [3:0] d;
input [3:0] n;
input CLK;
output [31:0] buf1;
output [31:0] buf2;
output [31:0] lengthbuf;
reg [31:0] deg_vec1;
reg [31:0] deg_vec2;
reg [31:0] buf1;
reg [31:0] buf2;
integer i;
wire [3:0] n;
wire [3:0] d;
reg [31:0] length;
reg [31:0] lengthbuf;
reg [2:0] Count;
reg [63:0] count1;
reg [3:0] nbuf;
reg [3:0] dbuf;

// This section creates two vectors of length 2**n where
// deg_vec1: each index indicates if the degree of that term is <= to d
// deg_vec2: each index indicates if the degree of that term is = to d
always@(posedge CLK)
begin
    nbuf<=n;
    dbuf<=d;
    buf1<=deg_vec1;
    buf2<=deg_vec2;
    lengthbuf<=length;
end

always@(*)
begin
    for (i=0; i<(2**nbuf); i=i+1)
    begin
        Count = Count2(i[7:4])+Count2(i[3:0]);
        if(Count<=dbuf)  // Creates vector for degree only
            deg_vec1[i]=1'b1;
        else deg_vec1[i]=1'b0;
        if(Count==dbuf)  //Creates vector for homogeneous functions
            deg_vec2[i]=1'b1;
        else deg_vec2[i]=1'b0;
    end
end

always@(buf1)
begin
    //count1 adds up the number of ones in DEG_VEC so you know how many
    //functions you need to enumerate.
    count1 = Count2(buf1[31:28]) + Count2(buf1[27:24]) +
             Count2(buf1[23:20]) + Count2(buf1[19:16]) + Count2(buf1[15:12]) +
             Count2(buf1[11:8]) + Count2(buf1[7:4]) + Count2(buf1[3:0]);
    length=2**count1;
end  //end always@(buf1)

//This function counts the number of ones in the input.
function [2:0] Count2;
input [3:0] AA;
begin: f2
    Count2[0] = AA[3]^AA[2]^AA[1]^AA[0];
    Count2[1] = (AA[3]&AA[2] | AA[3]&AA[1] | AA[3]&AA[0] | AA[2]&AA[1] | AA[2]&AA[0] |
                 AA[1]&AA[0]) ^ (AA[3]&AA[2]&AA[1]&AA[0]);
    Count2[2] = AA[3]&AA[2]&AA[1]&AA[0];
    // Count2[7:3]=5'b00000;
end
endfunction

endmodule
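
The two degree vectors can be reproduced in a few lines of software. The following sketch uses hypothetical names and packs each vector into an integer with bit i describing ANF term i; it follows the comparisons in the listing, where the first vector marks terms of degree at most d and the second marks terms of degree exactly d.

```python
def degree_vectors(n, d):
    """Return (deg_le, deg_eq) bit masks over the 2**n ANF term indices."""
    deg_le = 0  # bit i set when term i has degree <= d
    deg_eq = 0  # bit i set when term i has degree == d
    for idx in range(1 << n):
        weight = bin(idx).count("1")  # degree of term idx
        if weight <= d:
            deg_le |= 1 << idx
        if weight == d:
            deg_eq |= 1 << idx
    return deg_le, deg_eq
```

For n=4 and d=2 the second mask has six ones, one for each of the six degree-2 terms on four variables.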


2.  Macro_2.v 

Macro_2 is called inside a for loop using the output of Macro_1, the index of the
for loop, and a place in deg_vec2 that contains a one. It forms a new Boolean function
with degree d based on the inputs. The for loop runs lengthbuf/2 times the first time it is
called and cuts the length in half each time the for loop is reinitiated (when a new place in
deg_vec2 contains a one). The output of Macro_2 is a function in ANF of degree d;
that function is sent to the transeunt triangle, and the resulting truth table (TT) is sent
to the nonlinearity module.


module macro_2(index, deg_vec1, deg_vec2, i, fbuf, CLK);

input [7:0] index;
input [31:0] deg_vec1;
input [31:0] deg_vec2;
input [15:0] i;
input CLK;
output [31:0] fbuf;
wire [7:0] index;
wire [31:0] deg_vec1;
wire [31:0] deg_vec2;
wire [15:0] i;
reg [31:0] f;
reg [31:0] fbuf;
reg [7:0] indexbuf;
reg [31:0] deg_vec1buf;
reg [31:0] deg_vec2buf;
reg [15:0] ibuf;
integer j,k;

always@(posedge CLK)
begin
    indexbuf<=index;
    deg_vec1buf<=deg_vec1;
    deg_vec2buf<=deg_vec2;
    ibuf<=i;
    fbuf<=f;
end

always@(*)
begin
    k=0;
    for (j=31; j>=0; j=j-1)
    begin
        if(indexbuf==j)
            f[j]=1'b1;  //Ensure at least one term of degree d is a one
        else if(deg_vec1buf[j]==1'b0)
            f[j]=1'b0;  //ensure all terms of degree > d are zero
        else if((deg_vec2buf[j]==1'b1) && (j>indexbuf))
            f[j]=1'b0;
        else
        begin
            f[j]=i[k];  //Fill in other bits with a counter
            k=k+1;
        end
    end
end  //end always

endmodule
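
The fill rule in macro_2, together with the halving loop that drives it, can be checked with a small software model. This is a sketch with hypothetical function names; `enumerate_degree_d` is an assumed driver that imitates the calling loop described in the text and is not taken from the original sources. Fixing the highest degree-d term to one and forcing all higher degree-d positions to zero guarantees that each function is produced exactly once.

```python
def build_function(index, deg_le, deg_eq, counter, width):
    """Form one ANF (as an int) whose highest degree-d term is at `index`."""
    f = 0
    k = 0  # next counter bit to use
    for j in range(width - 1, -1, -1):
        if j == index:
            bit = 1                      # the chosen degree-d term is present
        elif not (deg_le >> j) & 1:
            bit = 0                      # terms of degree > d are absent
        elif (deg_eq >> j) & 1 and j > index:
            bit = 0                      # higher degree-d terms already covered
        else:
            bit = (counter >> k) & 1     # remaining bits come from the counter
            k += 1
        f |= bit << j
    return f


def enumerate_degree_d(n, d):
    """Yield every n-variable ANF of degree exactly d (assumed driver loop)."""
    width = 1 << n
    deg_le = sum(1 << i for i in range(width) if bin(i).count("1") <= d)
    deg_eq = sum(1 << i for i in range(width) if bin(i).count("1") == d)
    length = 1 << bin(deg_le).count("1")
    for index in range(width - 1, -1, -1):
        if (deg_eq >> index) & 1:
            length //= 2                 # one fewer free bit for each new index
            for counter in range(length):
                yield build_function(index, deg_le, deg_eq, counter, width)
```

For n=3 and d=2 the driver yields 64 + 32 + 16 = 112 functions, which equals the number of 3-variable functions of degree exactly 2.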


3.  subr.mc 

/********************************************************************/ 

/*  */ 

/*  subr.mc  -  MAP  C  subroutine  to  produce  ANF  homogeneous  functions*/ 
/*  of  n  variables  of  degree  d.  */ 

/*  *  / 

/*  Author:  Jennifer  Shafer  */ 

/*  Created:  April  3,  2009  */ 

/*  Last  modified:  April  5,  2009  */ 

/*  */ 

/*  Description:  This  program  calls  two  macros  and  outputs  */ 

/*  ANF  form  of  functions  with  degree  d  given  n.  */ 

/*  *  / 

/********************************************************************/ 

#include  <libmap.h> 

void  subr  (int64_t  a[],  int64_t  b[],  int64_t  c[],  int64_t  *time,  int 
mapnum)  { 


//  Declare  OBM  banks  in  SRC-6 

OBM  BANK  A  (A,  int64  t,  2) 


OBM_BANK_B  (B,  int64_t,  15) 

OBM_BANK_C  (C,  int64_t,  2) 

OBM_BANK_D  (D,  int64_t,  100) 

int64_t  tO,  tl,  i,  j,  k,  m,  length; 

int64  t  n,  deg,  index,  check  val; 
int64  t  deg  vec2,  countl,  f,  NL; 
int64  t  idx,  sel,  max=0; 

intl6  t  Hist0[]=0;  //initialize  the  values  to  0 
intl6_t  Histl[]=0; 
intl6_t  Hist2[]=0; 
intl6_t  Hist3[]=0; 

//  Get  input  values  by  DMAing  FROM  the  CPU 

DMA^CPU  (CM20BM,  A,  MAP_OBM_stripe ( 1 , "A" ) ,  a,  1, 

2*sizeof (int64_t)  ,  0); 

wait  DMA  ( 0 ) ; 

//  n  and  d  are  passed  in  from  the  main  function 
n=  A [ 0 ] ; 
deg=  A [ 1 ] ; 

read_timer ( &t0 ) ; 

/////////////////////////////////////////////////////////////////////// 

//  Macrol  takes  n  and  d  as  inputs  and  outputs  the  following: 

//  deg  vecl:  2An  bits  long,  if  bit  is  a  one,  the  term  is  of 

//  degree  d  or  less  if  bit  is  a  zero,  the  term  is  of  degree 

//  greater  than  d 

//  deg  vec2 :  2An  bits  long,  if  bit  is  a  one,  the  term  is  of 

//  degree  d  if  bit  is  a  zero,  the  term  is  not  of  degree  d 

//  countl:  the  number  of  ones  in  deg  vecl 

//  count2 :  the  number  of  ones  in  deg  vec2 

/////////////////////////////////////////////////////////////////////// 
macro_l (n,  deg,  &deg_vec2,  Slength)  ; 
check  val=  deg  vec2; 

/////////////////////////////////////////////////////////////////////// 
//  For  each  term  of  degree  d,  call  macro  2  with  the  following  inputs 
and  outputs: 

//  Inputs: 

//  index:  the  next  index  of  deg  vec2  that  contains  a  ' 1 ' 

//  deg  vecl:  the  vector  with  ones  at  terms  <=  d 

//  i:  the  next  number  generated  from  the  counter  from  countl 

//  Outputs: 

//  f:  the  next  function  with  degree  d 

//  Upon  completion  of  this  loop,  all  functions  of  degree  d  will  be 
stored  in  B 

/////////////////////////////////////////////////////////////////////// 

k=0  ; 

for  (j  =  31;  j  >=  0;  j--) { 

if (check_val  &  0x80000000) { 
index= j ; 

length=length*0 . 5; 

for  (i=0;  iklength;  i++) { 

macro  2 (index,  deg  vec2,  i,  &f ) ; 
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macro_3 (f,  &NL)  ; 

D [k] =NL; 
k=k+l ; 

}//end  for 
}//end  if 

check  val=check  val<<l; 

}//end  for 
C [ 0 ] =k; 

#pragma  loop  noloop  dep  //used  for  histogram 
for(m=0;  m<k;  m++) { 
idx=D [m] ; 

if(idx>max)  max=idx;  //set  max  to  the  highest  NL 

found 

sel  =  m  &  3; 

if  (sel  ==  0)  HistO [idx] ++; 
if  (sel  ==  1)  Histl [idx] ++; 
if  (sel  ==  2)  Hist2 [idx] ++; 
if  (sel  ==  3)  Hist3 [idx] ++; 

}  //end  for 

C [ 1 ] =max ; 

for(m=0;  m<=max;  m++) {  //add  the  values  back  together  to  pass 
back  to  CPU 

B [m] =HistO [m] +  Histl [m] +  Hist2 [m] +Hist3 [m] ; 

}  //end  for 


read_timer (&t1);

*time = (t1 - t0);

// Return functions by DMAing to the CPU

DMA_CPU (OBM2CM, B, MAP_OBM_stripe(1,"B"), b, 1, 15*sizeof(int64_t), 0);
wait_DMA (0);

DMA_CPU (OBM2CM, C, MAP_OBM_stripe(1,"C"), c, 1, 2*sizeof(int64_t), 0);
wait_DMA (0);

// DMA_CPU (OBM2CM, D, MAP_OBM_stripe(1,"D"), d, 1, 100*sizeof(int64_t), 0);
// wait_DMA (0);

}
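
The histogram loop in the listing above spreads its updates over four arrays (Hist0 to Hist3, selected by m & 3) and sums them afterwards, presumably so that consecutive pipelined iterations never update the same memory. The following is a minimal host-side sketch of that idea in plain C, not the MAP code itself. The names banked_histogram, NUM_BANKS, and MAX_BINS are hypothetical, and the bound of 64 bins is an assumption made only for illustration.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BANKS 4   /* the listing uses four partial histograms */
#define MAX_BINS  64  /* assumed upper bound on the values being counted */

/* Builds a histogram of vals[0..count-1] into hist[0..MAX_BINS-1] and
   returns the largest value seen. Every value must be in 0..MAX_BINS-1. */
int banked_histogram(const int *vals, int count, int64_t *hist) {
    int64_t bank[NUM_BANKS][MAX_BINS];
    int max = 0;

    memset(bank, 0, sizeof bank);

    /* Iteration m updates bank (m mod 4), so neighbouring iterations
       always touch different partial histograms. */
    for (int m = 0; m < count; m++) {
        int idx = vals[m];
        if (idx > max) max = idx;
        bank[m & (NUM_BANKS - 1)][idx]++;
    }

    /* Add the partial histograms back together. */
    for (int b = 0; b < MAX_BINS; b++) {
        hist[b] = 0;
        for (int s = 0; s < NUM_BANKS; s++) hist[b] += bank[s][b];
    }
    return max;
}
```

On a conventional CPU the banking brings no benefit; the sketch only shows that the summed result equals an ordinary single-array histogram.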


A.7  CODE TO GENERATE AND TEST ROTATION SYMMETRIC AND
DIHEDRAL SYMMETRIC FUNCTIONS

1.  mapROTS.v 


//---------------------------------------------------------------------
// mapROTS.v - Takes in a number from a counter and outputs a rotation
// symmetric function
// Created: May 15, 2009
// Last Modified: May 15, 2009
// Author: Jennifer Shafer
//---------------------------------------------------------------------

module mapROTS (CLK, TT, ROTS_REG, ANF);

parameter n=6;
parameter N=2**n;

input CLK;
input [15:0] TT;
output [N-1:0] ROTS_REG;
output [N-1:0] ANF;

wire [N-1:0] ANF;
reg [N-1:0] ROTS;
reg [N-1:0] ROTS_REG;
reg [15:0] TT_reg;

// Ensure all of TT goes into the Test_rs function together
always @ (posedge CLK)
begin
    TT_reg <= TT;
    ROTS_REG <= ROTS;
end

// Get a ROTS function depending on n
always @ (TT_reg)
begin: sel_mod
    case (n)
        2: ROTS = Test_rs2 (TT_reg);
        3: ROTS = Test_rs3 (TT_reg);
        4: ROTS = Test_rs4 (TT_reg);
        5: ROTS = Test_rs5 (TT_reg);
        6: ROTS = Test_rs6 (TT_reg);
        default ;
    endcase
end

trans_tri I0 (.IN (ROTS_REG), .OUT (ANF), .CLK (CLK));

// The rest of this module is the function definitions for each n
// The assign statements were generated from C-code

function [N-1:0] Test_rs2;
// for n=2
input [N-1:0] RSi;  // 4 bits
begin: rs2
    Test_rs2[0]= RSi[0];
    Test_rs2[1]= RSi[1];
    Test_rs2[2]= RSi[1];
    Test_rs2[3]= RSi[2];
end
endfunction

function [N-1:0] Test_rs3;
// for n=3
input [N-1:0] RSi;  // 8 bits
begin: rs3
    Test_rs3[0]= RSi[0];
    Test_rs3[1]= RSi[1];
    Test_rs3[2]= RSi[1];
    Test_rs3[3]= RSi[2];
    Test_rs3[4]= RSi[1];
    Test_rs3[5]= RSi[2];
    Test_rs3[6]= RSi[2];
    Test_rs3[7]= RSi[3];
end
endfunction

function [N-1:0] Test_rs4;
// for n=4
input [N-1:0] RSi;  // 16 bits
begin: rs4
    Test_rs4[0]= RSi[0];
    Test_rs4[1]= RSi[1];
    Test_rs4[2]= RSi[1];
    Test_rs4[3]= RSi[2];
    Test_rs4[4]= RSi[1];
    Test_rs4[5]= RSi[3];
    Test_rs4[6]= RSi[2];
    Test_rs4[7]= RSi[4];
    Test_rs4[8]= RSi[1];
    Test_rs4[9]= RSi[2];
    Test_rs4[10]= RSi[3];
    Test_rs4[11]= RSi[4];
    Test_rs4[12]= RSi[2];
    Test_rs4[13]= RSi[4];
    Test_rs4[14]= RSi[4];
    Test_rs4[15]= RSi[5];
end
endfunction

function [N-1:0] Test_rs5;
// for n=5
input [N-1:0] RSi;  // 32 bits
begin: rs5
    Test_rs5[0]= RSi[0];
    Test_rs5[1]= RSi[1];
    Test_rs5[2]= RSi[2];
    Test_rs5[3]= RSi[3];
    Test_rs5[4]= RSi[4];
    Test_rs5[5]= RSi[5];
    Test_rs5[6]= RSi[6];
    Test_rs5[7]= RSi[7];
    Test_rs5[8]= RSi[2];
    Test_rs5[9]= RSi[8];
    Test_rs5[10]= RSi[9];
    Test_rs5[11]= RSi[10];
    Test_rs5[12]= RSi[6];
    Test_rs5[13]= RSi[11];
    Test_rs5[14]= RSi[12];
    Test_rs5[15]= RSi[13];
    Test_rs5[16]= RSi[1];
    Test_rs5[17]= RSi[14];
    Test_rs5[18]= RSi[8];
    Test_rs5[19]= RSi[15];
    Test_rs5[20]= RSi[5];
    Test_rs5[21]= RSi[16];
    Test_rs5[22]= RSi[11];
    Test_rs5[23]= RSi[17];
    Test_rs5[24]= RSi[3];
    Test_rs5[25]= RSi[15];
    Test_rs5[26]= RSi[10];
    Test_rs5[27]= RSi[18];
    Test_rs5[28]= RSi[7];
    Test_rs5[29]= RSi[17];
    Test_rs5[30]= RSi[13];
    Test_rs5[31]= RSi[19];
end
endfunction

function [N-1:0] Test_rs6;
// for n=6
input [N-1:0] RSi;  // 64 bits
begin: rs6
    Test_rs6[0]= RSi[0];
    Test_rs6[1]= RSi[1];
    Test_rs6[2]= RSi[1];
    Test_rs6[3]= RSi[2];
    Test_rs6[4]= RSi[1];
    Test_rs6[5]= RSi[3];
    Test_rs6[6]= RSi[2];
    Test_rs6[7]= RSi[4];
    Test_rs6[8]= RSi[1];
    Test_rs6[9]= RSi[5];
    Test_rs6[10]= RSi[3];
    Test_rs6[11]= RSi[6];
    Test_rs6[12]= RSi[2];
    Test_rs6[13]= RSi[7];
    Test_rs6[14]= RSi[4];
    Test_rs6[15]= RSi[8];
    Test_rs6[16]= RSi[1];
    Test_rs6[17]= RSi[3];
    Test_rs6[18]= RSi[5];
    Test_rs6[19]= RSi[7];
    Test_rs6[20]= RSi[3];
    Test_rs6[21]= RSi[9];
    Test_rs6[22]= RSi[6];
    Test_rs6[23]= RSi[10];
    Test_rs6[24]= RSi[2];
    Test_rs6[25]= RSi[6];
    Test_rs6[26]= RSi[7];
    Test_rs6[27]= RSi[11];
    Test_rs6[28]= RSi[4];
    Test_rs6[29]= RSi[10];
    Test_rs6[30]= RSi[8];
    Test_rs6[31]= RSi[12];
    Test_rs6[32]= RSi[1];
    Test_rs6[33]= RSi[2];
    Test_rs6[34]= RSi[3];
    Test_rs6[35]= RSi[4];
    Test_rs6[36]= RSi[5];
    Test_rs6[37]= RSi[6];
    Test_rs6[38]= RSi[7];
    Test_rs6[39]= RSi[8];
    Test_rs6[40]= RSi[3];
    Test_rs6[41]= RSi[7];
    Test_rs6[42]= RSi[9];
    Test_rs6[43]= RSi[10];
    Test_rs6[44]= RSi[6];
    Test_rs6[45]= RSi[11];
    Test_rs6[46]= RSi[10];
    Test_rs6[47]= RSi[12];
    Test_rs6[48]= RSi[2];
    Test_rs6[49]= RSi[4];
    Test_rs6[50]= RSi[6];
    Test_rs6[51]= RSi[8];
    Test_rs6[52]= RSi[7];
    Test_rs6[53]= RSi[10];
    Test_rs6[54]= RSi[11];
    Test_rs6[55]= RSi[12];
    Test_rs6[56]= RSi[4];
    Test_rs6[57]= RSi[8];
    Test_rs6[58]= RSi[10];
    Test_rs6[59]= RSi[12];
    Test_rs6[60]= RSi[8];
    Test_rs6[61]= RSi[12];
    Test_rs6[62]= RSi[12];
    Test_rs6[63]= RSi[13];
end
endfunction

endmodule


module trans_tri(IN, OUT, CLK);
parameter n = 6;                 // Number of variables.
localparam N = 2**n;             // Number of inputs and outputs
output [N-1:0] OUT;              // OUT is the ANF of the input function.
input [N-1:0] IN;                // IN is the specified truth table of the input function.
reg [N-1:0] EXOR_array [N-1:0];  // The array in which the transeunt triangle is embedded.
input CLK;
integer i, j;

always @(posedge CLK)
begin
    EXOR_array[0] = IN;          // Set left column of EXOR_array to IN.

    for (i=1; i<N; i=i+1)        // Enumerate a level in the transeunt triangle.
    begin
        for (j=0; j<N; j=j+1)    // Enumerate a position in the current level.
        begin: level
            if (j <= i-1) EXOR_array[i][j] = EXOR_array[i-1][j];
            else EXOR_array[i][j] = EXOR_array[i-1][j] ^ EXOR_array[i-1][j-1];
        end
    end
end

assign OUT = EXOR_array[N-1];

endmodule
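
The level-by-level XOR in trans_tri can be checked on a PC with a short software model. The sketch below, in plain C for n <= 6, mirrors the loop structure of the module (position j is XORed with its left neighbour once the level index i has reached it) and compares it against the usual in-place butterfly form of the truth table to ANF conversion. The butterfly version is a standard alternative and is not taken from the thesis; the names anf_triangle and anf_butterfly are hypothetical.

```c
#include <stdint.h>

/* Software model of the trans_tri module: at level i, every position
   j >= i becomes cur[j] XOR cur[j-1]; positions j < i are copied. */
uint64_t anf_triangle(uint64_t tt, int n) {
    int N = 1 << n;
    uint64_t cur = tt;
    for (int i = 1; i < N; i++) {
        uint64_t next = cur;
        for (int j = i; j < N; j++) {
            uint64_t bit = ((cur >> j) ^ (cur >> (j - 1))) & 1u;
            next = (next & ~(1ULL << j)) | (bit << j);
        }
        cur = next;
    }
    return cur;
}

/* Butterfly form of the same conversion: for each variable k, XOR the
   entry with bit k cleared into the entry with bit k set. */
uint64_t anf_butterfly(uint64_t tt, int n) {
    int N = 1 << n;
    uint64_t f = tt;
    for (int k = 0; k < n; k++) {
        for (int j = 0; j < N; j++) {
            if (j & (1 << k)) {
                uint64_t src = (f >> (j ^ (1 << k))) & 1u;
                f ^= src << j;
            }
        }
    }
    return f;
}
```

Both functions are their own inverse, so applying either one to an ANF string returns the truth table. The butterfly form needs n passes instead of 2^n - 1 levels, which matters in software but not necessarily in a fully unrolled hardware pipeline.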


2.  subr.mc 


/*********************************************************************/
/* subr.mc - Subroutine to produce degree, NL and homog of a        */
/*           function.                                              */
/*                                                                  */
/* Author: Jennifer Shafer                                          */
/* Created: May 2009                                                */
/* Last modified: July 7, 2009                                      */
/*                                                                  */
/* Description: This program calls two macros and finds homog       */
/*              and degree using C-code.                            */
/*                                                                  */
/*********************************************************************/

#include <libmap.h>

#define length 16384  //number of ROTS functions for n=6

void subr (int64_t a[], int64_t b[], int64_t c[], int64_t d[], int64_t e[],
           int64_t f[], int64_t *time, int mapnum) {

    // Declare the OBM banks in the SRC-6 that hold the counter input
    // and the results for each function.
    OBM_BANK_A (A, int64_t, length)  // Stores counter to mapper
    OBM_BANK_B (B, int64_t, length)  // Stores degree
    OBM_BANK_C (C, int64_t, length)  // Stores NL
    OBM_BANK_D (D, int64_t, length)  // Stores homogeneity
    OBM_BANK_E (E, int64_t, length)  // Stores balance
    OBM_BANK_F (F, int64_t, length)  // Stores truth table

    int t0, t1, i, j, k, flag, cnt;
    int check1, check2, ones_count, count;
    uint64_t test, bithigh;
    uint64_t out1, out2;

    DMA_CPU (CM2OBM, A, MAP_OBM_stripe(1,"A"), a, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);
    read_timer (&t0);

    for (i = 0; i < length; i++) {
        E[i] = 0;

        mapROTS (A[i], &out1, &out2);  //Returns the TT and ANF of function

        F[i] = out1;  //TT form
        popcount_64 (out1, &count);
        if (count == 32) E[i] = 1;     //balanced: half of the 64 outputs are 1
        my_nl6n (out1, &C[i]);         //Returns NL

        // D[i] is 1 for homogeneous, 0 for non-homogeneous
        // B[i] is the degree of the function
        D[i] = 1;
        B[i] = 0;
        check1 = 0;
        check2 = 0;
        flag = 1;
        ones_count = 0;
        test = out2;

        //for loop checks each bit in the function one at a time
        for (j = 0; j <= 63; j++) {
            bithigh = (0x0000000000000001 & test);
            if (bithigh != 0) {
                cnt = j;
                popcount_32 (cnt, &ones_count);  //finds the degree of the bit

                // Set first degree in a degree tracker
                if (flag) {
                    check1 = ones_count;
                    flag = 0;
                }

                // If any degrees are different, the function is not homogeneous
                if (ones_count != check1) D[i] = 0;

                // Sets degree to highest degree found in function
                if (ones_count > check2) {
                    B[i] = ones_count;
                    check2 = ones_count;
                }
            }
            test = test>>1;
        }
    }

    read_timer (&t1);

    *time = (t1 - t0);

    // Return the results by DMAing to the CPU
    DMA_CPU (OBM2CM, B, MAP_OBM_stripe(1,"B"), b, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);

    DMA_CPU (OBM2CM, C, MAP_OBM_stripe(1,"C"), c, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);

    DMA_CPU (OBM2CM, D, MAP_OBM_stripe(1,"D"), d, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);

    DMA_CPU (OBM2CM, E, MAP_OBM_stripe(1,"E"), e, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);

    DMA_CPU (OBM2CM, F, MAP_OBM_stripe(1,"F"), f, 1, length*sizeof(int64_t), 0);
    wait_DMA (0);

}
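
The degree and homogeneity test in the inner loop of the subroutine above relies on one fact: bit j of the ANF string stands for a monomial whose degree is the number of ones in j. A minimal host-side sketch of the same test in plain C follows. It is not the MAP code; the helper names popcount64 and anf_degree_homog are hypothetical, and, like the listing, it reports a function with no monomials as homogeneous with degree 0.

```c
#include <stdint.h>

/* Number of set bits in x (portable; no compiler builtins assumed). */
static int popcount64(uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* degree:      largest monomial degree present in the ANF string.
   homogeneous: 1 if every monomial present has the same degree, else 0. */
void anf_degree_homog(uint64_t anf, int *degree, int *homogeneous) {
    int first = -1;  /* degree of the first monomial found, -1 if none yet */
    *degree = 0;
    *homogeneous = 1;
    for (int j = 0; j < 64; j++) {
        if ((anf >> j) & 1u) {
            int d = popcount64((uint64_t)j);  /* degree of monomial j */
            if (first < 0) first = d;
            if (d != first) *homogeneous = 0;
            if (d > *degree) *degree = d;
        }
    }
}
```

For example, the ANF string 0x01161668 listed in Appendix C.1 has ones only at indices of weight two, so the sketch reports degree 2 and homogeneous.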


A.8  CODE  TO  GENERATE  ALL  AFFINE  FUNCTIONS 

This module generates every affine function for n=8. The code can be modified
for other n by changing the parameter N in the module to 2^n; the number of
outputs should then be 2^n/64. This code can also be used to generate only half
of the affine functions, because every affine function has a complement that is
also an affine function. To do this, a change is made in the subroutine shown
in the next section.


1.  genaff.v 


module genaff(IN, OUT1, OUT2, OUT3, OUT4, CLK);

input [31:0] IN;
output [63:0] OUT1;
output [63:0] OUT2;
output [63:0] OUT3;
output [63:0] OUT4;
input CLK;

reg [255:0] TEMPOUT;
reg [63:0] OUT1;
reg [63:0] OUT2;
reg [63:0] OUT3;
reg [63:0] OUT4;
wire [31:0] IN;
integer j;

parameter N=256;

// Bit j of the truth table is the parity of IN masked with (2j+1):
// bit 0 of IN is the constant term, the higher bits select the variables.
always @ (*)
begin
    for (j=0; j<N; j=j+1)
    begin
        TEMPOUT[j] <= (^(IN & ((j<<1)+1)));
    end
end

always @ (posedge CLK)
begin
    OUT1 <= TEMPOUT[255:192];
    OUT2 <= TEMPOUT[191:128];
    OUT3 <= TEMPOUT[127:64];
    OUT4 <= TEMPOUT[63:0];
end

endmodule
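
The reduction XOR in genaff can be modelled in plain C to see why odd and even counter values give complementary functions. The sketch below is restricted to n <= 6 so that a truth table fits in one 64-bit word (the module itself targets n=8); the names parity64 and affine_tt are hypothetical.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of x. */
static unsigned parity64(uint64_t x) {
    unsigned p = 0;
    while (x) { p ^= 1u; x &= x - 1; }
    return p;
}

/* Truth table of the affine function selected by coeffs, for n <= 6.
   Bit 0 of coeffs is the constant term; bits 1..n select the variables,
   matching the mask (j<<1)+1 used in the Verilog module. */
uint64_t affine_tt(uint32_t coeffs, int n) {
    uint64_t tt = 0;
    uint32_t N = 1u << n;
    for (uint32_t j = 0; j < N; j++) {
        uint64_t masked = coeffs & ((j << 1) + 1);
        tt |= (uint64_t)parity64(masked) << j;
    }
    return tt;
}
```

Because bit 0 of the mask is always set, flipping bit 0 of coeffs complements every output. Stepping the counter by two therefore visits exactly one function from each complementary pair.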


2.  subr.mc 

The subroutine calls the module 2^(n+1) times for all the affine functions or 2^n times
for half the affine functions. Length is defined using a define statement and should be
2^(n+1). If generating only half the affine functions, the subroutine must be modified by
changing the subroutine for loop to increment the index by two instead of one. This is
noted in the subroutine code. Also, the length of vector A decreases by half.

#include <libmap.h>

#define length 512

void subr (int64_t a[], int64_t *time, int mapnum) {

    // Declare OBM banks in SRC-6
    OBM_BANK_A (A, int64_t, 4*length)
    int64_t t0, t1;
    int i;

    read_timer (&t0);

    // To get only half the affine functions, change i++ to i=i+2 and
    // change each i inside the brackets of A[] in the macro call to i/2
    for (i=0; i<length; i++) {
        genaff (i, &A[i*4], &A[i*4+1], &A[i*4+2], &A[i*4+3]);
    }

    read_timer (&t1);

    *time = (t1 - t0);

    // Return functions by DMAing to the CPU
    DMA_CPU (OBM2CM, A, MAP_OBM_stripe(1,"A"), a, 1, 4*length*sizeof(int64_t), 0);
    wait_DMA (0);

}


3.  main.c


The main file composes the output to be printed in the form of Verilog assign
statements. Because printing a hexadecimal number leaves off leading zeros, they must
be added in by hand. The spaces added into the output make it easy to determine where
zeros were left off. They are only seen in functions with the pattern using 0x0 and 0xF.
The spaces must then be removed to prevent errors when inserting the code into the final
module. If printing only half the affine functions, the parameter length would be 2^n.
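
The hand-editing step described above could probably be avoided with a zero-padded field width in the format string. The sketch below is one possible way to do this in standard C and is not part of the thesis code; format_afn is a hypothetical helper, and it assumes the four 64-bit words are passed most significant first, in the order the main file prints them.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats one 256-bit Verilog constant from four 64-bit words.
   %016llx pads each word to 16 hex digits, so leading zeros are kept
   and no separating spaces are needed. Returns the snprintf result. */
int format_afn(char *buf, size_t size, int index, const uint64_t w[4]) {
    return snprintf(buf, size,
                    "assign afns[%d]=256'h%016llx%016llx%016llx%016llx;",
                    index,
                    (unsigned long long)w[0], (unsigned long long)w[1],
                    (unsigned long long)w[2], (unsigned long long)w[3]);
}
```

With this format every constant is exactly 64 hex digits long, so a word such as 0xF is emitted as 000000000000000f rather than f.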

#include <map.h>
#include <stdio.h>
#include <stdlib.h>

#define length 512  // 2^(n+1)

// Declaration of the subroutine
void subr (int64_t*, int64_t*, int);

int main (int argc, char *argv[]) {

    int mapnum = 0;    // Indicates which map to use
    int64_t time_clk;  // Reads internal clock
    int64_t *a;        // Array filled by the subroutine call
    int i;

    // Allocate the array of affine functions (four 64-bit words each)
    a = (int64_t *) malloc (4*length*sizeof(int64_t));

    map_allocate (1);

    // Call subroutine subr.mc on the MAP.
    subr (a, &time_clk, mapnum);

    // Print out the number of clocks.
    printf ("%lld clocks\n", time_clk);

    for (i=0; i<length; i++) {
        printf ("assign afns[%i]=256'h%llx %llx %llx %llx;\n", i, a[4*i],
                a[4*i+1], a[4*i+2], a[4*i+3]);
    }

    map_free (1);
    exit (0);

} //int main (int argc, char *argv[])


APPENDIX B.  C-CODE


B.1  C-CODE TO GENERATE ROTATION SYMMETRIC MAPPER


/*********************************************************************/
/* createROTS.c - Code to generate assign statements for Verilog    */
/* Author: Jennifer Shafer                                          */
/* Created: October 31, 2008                                        */
/* Last modified: February 23, 2009                                 */
/* Description: Takes a series of binary values and determines      */
/*   which are rotational symmetric and prints assign statements    */
/*   for Verilog code to create a correct truth table.              */
/*********************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int shift(int k, int m, int n);

int main()
{
    int n=6;          //number of variables
    int i,j;          //used in for loops
    int length=64;    //length of truth table, 2^n
    int counter=0;
    int type;
    int RSi[length];  //vector to store number of ROTS

    // Assign all values in RSi vector to 255
    for(i=0; i<length; i++) RSi[i]=255;

    // For each value in the list, if the RSi value is 255, give the
    // new ROTS a number and find the rest of them
    for (i=0; i<length; i++)
    {   if(RSi[i]==255){
            // For each time you need to shift, find the new value in the
            // list and number it
            for (j=0; j<n; j++) {
                type = shift(i, j, n);  //Call the shift function and return the shifted value
                RSi[type]=counter;
            }
            counter++;  // If a new number has been assigned to a series of ROTS, increment the number
        }
    }

    // Print the assign statements to use in Verilog code
    for (i=0; i<length; i++){
        printf("\nTest_rs6[%i]= RSi[%i];", i, RSi[i]);
    }
    return 0;
}

// k is number to shift (0 through length), m is number of times to
// shift (0 through n-1), n is number of variables
int shift(int k, int m, int n)
{
    unsigned int lower, upper, new;  //only the low n bits are kept

    lower=k<<m;
    upper=lower>>n;
    new=lower | upper;   //To create rotational shift
    new=new<<(32-n);     //To clear number in bits above n
    new=new>>(32-n);

    return(new);  //Return the rotationally shifted number
}
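
The class numbering produced by createROTS.c can be reproduced and counted with a compact routine. The sketch below is an illustrative variant in plain C, not the thesis code: it rotates with a bit mask instead of the shift-left and shift-right pair, and it returns the number of rotation classes. The names rotl_n and rotation_classes are hypothetical.

```c
#include <stdint.h>

/* Rotate the n-bit value k left by m positions (0 <= m < n, n < 32). */
unsigned rotl_n(unsigned k, int m, int n) {
    unsigned mask = (1u << n) - 1u;
    return ((k << m) | (k >> (n - m))) & mask;
}

/* Labels every n-bit index with the number of its rotation class, in
   order of first appearance, and returns the number of classes.
   label must have room for 2^n entries. */
int rotation_classes(int n, int *label) {
    int N = 1 << n;
    int count = 0;
    for (int i = 0; i < N; i++) label[i] = -1;
    for (int i = 0; i < N; i++) {
        if (label[i] < 0) {
            for (int m = 0; m < n; m++)
                label[rotl_n((unsigned)i, m, n)] = count;
            count++;
        }
    }
    return count;
}
```

For n=6 the routine finds 14 classes, which agrees with the indices RSi[0] through RSi[13] in Test_rs6 and with the 2^14 = 16384 rotation symmetric functions assumed by the length constant in subr.mc.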


B.2  C-CODE  TO  GENERATE  DIHEDRAL  SYMMETRIC  MAPPER 


/*********************************************************************/
/* createDihedral.c - Generates the assign statements for Verilog   */
/*                                                                  */
/* Author: Jennifer L. Shafer                                       */
/* Created: April 6, 2009                                           */
/* Last modified: June 2, 2009                                      */
/* Description: Takes a series of binary values and determines      */
/*   which are dihedral symmetric and prints assign statements      */
/*   for Verilog code to create a correct truth table.              */
/*********************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int shift(int k, int m, int n);
int reverse(int k, int n);

int main()
{
    int n=8;          //number of variables
    int i,j;          //used in for loop
    int length=256;   //length of truth table, 2^n
    int counter=0;
    int type1, type2;
    int RSi[length];  //vector to store number of ROTS

    // Assign all values in RSi vector to 260
    for(i=0; i<length; i++) RSi[i]=260;

    // For each value in the list, if the RSi value is 260, give the new
    // ROTS a number and find the other ones
    for (i=0; i<length; i++){
        if(RSi[i]==260) {
            RSi[i]=counter;
            for (j=0; j<n; j++) {
                type1 = shift(i, j, n);
                RSi[type1]=counter;

                type2 = reverse(type1, n);  //Call the reverse function and return the reversed value
                RSi[type2]=counter;
            }
            counter++;  // If a new number has been assigned to a series of ROTS, increment the number
        }
    }

    // Print the assign statements to use in Verilog code
    for (i=0; i<length; i++){
        printf("\nassign TT[%i]= RSi[%i];", i, RSi[i]);
    }
    printf("\n");
    return 0;
}

// k is number to shift (0 through length), m is number of times to
// shift (0 through n-1), n is number of bits
int shift(int k, int m, int n)
{
    unsigned int lower, upper, new;  //only the low n bits are kept

    lower=k<<m;
    upper=lower>>n;
    new=lower | upper;   //To create rotational shift
    new=new<<(32-n);     //To clear number in bits above n
    new=new>>(32-n);

    return(new);  //Return the rotation shifted number
}

// k is number to reverse (0 through length), n is number of bits
int reverse(int k, int n)
{
    int lower, new, i, lsb, b;

    lower=k;
    new=0;

    for(i=0; i<n; i++){
        lsb=lower & 0x01;
        lower=lower>>1;
        b=lsb<<(n-(i+1));
        new=new | b;
    }

    return(new);  //Return the bit-reversed number
}
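
Adding the bit reversal to the rotations merges some rotation classes. The sketch below, in plain C, counts the resulting dihedral classes so the effect can be seen for small n. It is an illustrative variant rather than the thesis code, and the names rotl_bits, reverse_bits, and dihedral_classes are hypothetical.

```c
#include <stdint.h>

/* Rotate the n-bit value k left by m positions (0 <= m < n, n < 32). */
unsigned rotl_bits(unsigned k, int m, int n) {
    unsigned mask = (1u << n) - 1u;
    return ((k << m) | (k >> (n - m))) & mask;
}

/* Reverse the order of the low n bits of k. */
unsigned reverse_bits(unsigned k, int n) {
    unsigned r = 0;
    for (int i = 0; i < n; i++) {
        r = (r << 1) | (k & 1u);
        k >>= 1;
    }
    return r;
}

/* Labels every n-bit index with its dihedral class (all rotations and
   their reversals) and returns the number of classes.
   label must have room for 2^n entries. */
int dihedral_classes(int n, int *label) {
    int N = 1 << n;
    int count = 0;
    for (int i = 0; i < N; i++) label[i] = -1;
    for (int i = 0; i < N; i++) {
        if (label[i] < 0) {
            for (int m = 0; m < n; m++) {
                unsigned r = rotl_bits((unsigned)i, m, n);
                label[r] = count;
                label[reverse_bits(r, n)] = count;
            }
            count++;
        }
    }
    return count;
}
```

For n=6 this gives 13 classes instead of the 14 rotation classes: the classes containing 11 (001011) and 13 (001101) are mirror images of each other and merge under reversal.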


B.3  C-CODE  TO  GENERATE  VERILOG  MODULES  FOR  NONLINEARITY 

This code works for even n. The modules min2, min4 and OC are the same for
any n and should be added to the output of this code. The code can be modified to
produce the code needed if only enumerating half the affine functions and then finding
the minimum of the complement of each affine function by subtracting the count from 2^n.
The lower of the two is the minimum of both the affine function tested and its
complement. See Appendix A.1 for an example.
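
The complement shortcut described above can be sketched on a PC as follows. This plain C version, for n <= 6, compares the truth table only against the 2^n linear functions and obtains the distance to each complement as 2^n minus the count. It is a software illustration of the idea, not the generated Verilog, and the names popcount64 and nonlinearity are hypothetical.

```c
#include <stdint.h>

/* Number of set bits in x (portable; no compiler builtins assumed). */
static int popcount64(uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* Nonlinearity of an n-variable function (n <= 6) given as a truth table:
   the minimum Hamming distance to any affine function. */
int nonlinearity(uint64_t tt, int n) {
    int N = 1 << n;
    uint64_t mask = (n == 6) ? ~0ULL : ((1ULL << N) - 1);
    int best = N;
    for (int a = 0; a < N; a++) {
        /* Truth table of the linear function a.x (no constant term). */
        uint64_t lin = 0;
        for (int x = 0; x < N; x++)
            lin |= (uint64_t)(popcount64((uint64_t)(a & x)) & 1) << x;

        int d  = popcount64((tt ^ lin) & mask);  /* distance to a.x       */
        int dc = N - d;                          /* distance to a.x XOR 1 */
        if (d < best)  best = d;
        if (dc < best) best = dc;
    }
    return best;
}
```

For the four-variable bent function with truth table 0x7888 (x1x2 XOR x3x4 under the bit ordering assumed here) the routine returns 6, the same value as the heading of the four-variable list in Appendix C.1.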

#include <stdio.h>

#define n 6
#define N 64  //2^n
#define M N/4
int main () {
    int j;
    int i;
    int k;
    int temp, min;
    char a;

    // Module that counts the ones in TT and keeps the smaller of the
    // count and its complement
    printf(" module count(TT, CLK, count);\n");
    printf("input [%i:0] TT;\n", (N-1));
    printf(" input CLK;\n");
    printf(" output [%i:0] count;", n);
    printf("reg [%i:0] count;\n\n", n);
    printf("reg [%i:0] cnt;\n\n", n);

    a=0x41;
    for (j=0; j<(M/16); j++){
        if(a==0x59) a=0x61;
        printf("reg [4:0] count%c, count%c, count%c, count%c;\n", a, a+1, a+2, a+3);
        a=a+4;
    }
    printf("\n");

    i=0;
    for (j=0; j<(M/16); j++){
        i=j*16;
        printf("wire [2:0] count%i, count%i, count%i, count%i, count%i, count%i, "
               "count%i, count%i, count%i, count%i, count%i, count%i, count%i, "
               "count%i, count%i, count%i;\n",
               i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11,
               i+12, i+13, i+14, i+15);
    }
    printf("\n");

    for (j=0; j<M; j++){
        printf("OC o%i(TT[%i:%i], count%i);\n", j, (j*4)+3, (j*4), j);
    }
    printf("\n");

    printf("always@(posedge CLK)\n begin\n");
    a=0x41;
    for(j=0; j<(M/4); j++){
        if(a==0x59) a=0x61;
        i=j*4;
        printf(" count%c <= count%i+ count%i+ count%i+ count%i;\n", a, i, i+1, i+2, i+3);
        a=a+1;
    }

    printf(" cnt <= ");
    a=0x41;
    for (j=0; j<(M/16); j++){
        if(a==0x59) a=0x61;
        if(j==(M/16-1)) printf("count%c+ count%c+ count%c+ count%c ", a, a+1, a+2, a+3);
        else printf("count%c+ count%c+ count%c+ count%c+ ", a, a+1, a+2, a+3);
        a=a+4;
    }
    printf(";\n");

    printf(" if(cnt<=%i) count=cnt;\n", N/2);
    printf(" else count=%i-cnt;\n", N);
    printf("end\n endmodule\n");

    // Module that finds the nonlinearity of TT
    printf(" module fit%in (TT, CLK, fit);\n", n);
    printf("input [%i:0] TT;\n", (N-1));
    printf("input CLK/* synthesis syn_noclockbuf=1 syn_maxfan=100000 */;\n");
    printf(" output [%i:0] fit; \n wire [%i:0]fit;\n", n, n);
    printf("wire [%i:0] afns [%i:0];\n\n", (N-1), (N-1));

    i=0;
    for (j=0; j<N/32; j++) {
        printf("reg [%i:0] res%i, res%i, res%i, res%i, res%i, res%i, res%i, res%i, "
               "res%i, res%i, res%i, res%i, res%i, res%i, res%i, res%i, "
               "res%i, res%i, res%i, res%i, res%i, res%i, res%i, res%i, "
               "res%i, res%i, res%i, res%i, res%i, res%i, res%i, res%i;\n",
               (N-1), i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10,
               i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20,
               i+21, i+22, i+23, i+24, i+25, i+26, i+27, i+28, i+29, i+30, i+31);
        i=i+32;
    }
    printf("\n");

    i=0;
    for (j=0; j<N/32; j++) {
        printf("wire [%i:0] counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, "
               "counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, "
               "counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, "
               "counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i, counts%i;\n",
               n, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10,
               i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20,
               i+21, i+22, i+23, i+24, i+25, i+26, i+27, i+28, i+29, i+30, i+31);
        i=i+32;
    }
    printf("\n");

    i=0;
    for (j=0; j<(N/32); j++){
        printf("wire [%i:0] min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, "
               "min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, "
               "min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, "
               "min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i, min_1_%i;\n",
               n, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10,
               i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20,
               i+21, i+22, i+23, i+24, i+25, i+26, i+27, i+28, i+29, i+30, i+31);
        i=i+32;
    }
    printf("\n");

    i=0;
    for (j=0; j<(N/32); j++){
        printf("wire [%i:0] min_2_%i, min_2_%i, min_2_%i, min_2_%i, "
               "min_2_%i, min_2_%i, min_2_%i, min_2_%i;\n",
               n, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7);
        i=i+8;
    }
    printf("\n");

    i=0;
    k=3;
    temp=N/32;
    while (temp>0) {
        for (j=0; j<temp; j++){
            printf("wire [%i:0] min_%i_%i, min_%i_%i;\n", n, k, i, k, i+1);
            i=i+2;
        }
        k++;
        i=0;
        temp=temp/4;
        printf("\n");
    }

    //assign afns using code from SRC-6 module gen_affine
    printf("\n * * * Insert affine functions here* * *\n\n");

    for (j=0; j<N; j++){
        printf("count c%i(res%i, CLK, counts%i);\n", j, j, j);
    }
    printf("\n");

    i=0;
    for (j=0; j<(N/4); j++){
        printf("min4 m1_%i(counts%i, counts%i, counts%i, counts%i, CLK, min_1_%i);\n",
               j, i, i+1, i+2, i+3, j);
        i=i+4;
    }
    printf("\n");

    i=1;
    j=0;
    temp=N/16;
    while(temp>=1){
        min=temp;
        for (k=0; k<min; k++){
            printf("min4 m%i_%i(min_%i_%i, min_%i_%i, min_%i_%i, min_%i_%i, CLK, min_%i_%i);\n",
                   i+1, k, i, j, i, j+1, i, j+2, i, j+3, i+1, k);
            j=j+4;
        }
        temp=temp/4;
        i++;
        j=0;
        printf("\n");
    }

    if(k==2)
        printf("min2 m%i_0(min_%i_0, min_%i_1, CLK, fit);\n", i+1, i, i);
    printf("\n");

    printf("always@(posedge CLK)\n begin\n");
    if(k==1)
        printf("fit<=min_%i_%i;\n", i, k-1);

    for (j=0; j<N; j++){
        printf("res%i <= TT ^ afns[%i];\n", j, j);
    }
    printf("end\n endmodule\n");
    return 0;
}


APPENDIX C.  LISTS OF FUNCTIONS OF INTEREST


C.1  ROTATION SYMMETRIC FUNCTIONS WITH HIGHEST
NONLINEARITY

1.  Functions  on  Four  Variables  in  ANF  with  Nonlinearity  6 

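$f(x_1,x_2,x_3,x_4) = x_2x_4 \oplus x_4 \oplus x_1x_3 \oplus x_3 \oplus x_2 \oplus x_1$

$f(x_1,x_2,x_3,x_4) = x_3x_4 \oplus x_2x_4 \oplus x_1x_4 \oplus x_4 \oplus x_2x_3 \oplus x_1x_3 \oplus x_3 \oplus x_1x_2 \oplus x_2 \oplus x_1$

$f(x_1,x_2,x_3,x_4) = x_2x_4 \oplus x_1x_3$ (homogeneous)

$f(x_1,x_2,x_3,x_4) = x_3x_4 \oplus x_2x_4 \oplus x_1x_4 \oplus x_2x_3 \oplus x_1x_3 \oplus x_1x_2$ (homogeneous)

$f(x_1,x_2,x_3,x_4) = x_2x_4 \oplus x_4 \oplus x_1x_3 \oplus x_3 \oplus x_2 \oplus x_1 \oplus 1$

$f(x_1,x_2,x_3,x_4) = x_3x_4 \oplus x_2x_4 \oplus x_1x_4 \oplus x_4 \oplus x_2x_3 \oplus x_1x_3 \oplus x_3 \oplus x_1x_2 \oplus x_2 \oplus x_1 \oplus 1$

$f(x_1,x_2,x_3,x_4) = x_2x_4 \oplus x_1x_3 \oplus 1$

$f(x_1,x_2,x_3,x_4) = x_3x_4 \oplus x_2x_4 \oplus x_1x_4 \oplus x_2x_3 \oplus x_1x_3 \oplus x_1x_2 \oplus 1$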


2.  Functions  on  Five  Variables  in  ANF  String  with  Nonlinearity  12 

(h)  indicates  a  homogeneous  function  of  degree  2 


0x167c6ea1
0x01021049
0x131f57fe
0x00140621
0x176a78c9
0x05773f7e
0x04752f37
0x130b51df
0x01161668 (h)
0x131e56e9
0x0103115e
0x167d6fb6
0x05763e69
0x176b79de
0x00150736
0x0117177f
0x130a50c8
0x04742e20
0x04742e21
0x130a50c9
0x0117177e
0x00150737
0x176b79df
0x05763e68
0x167d6fb7
0x0103115f
0x131e56e8
0x01161669
0x130b51de
0x04752f36
0x05773f7f
0x176a78c8
0x00140620 (h)
0x131f57ff
0x01021048 (h)
0x167c6ea0


3.  Functions on Six Variables in ANF String with Nonlinearity 28

(h)  indicates  a  homogeneous  function  of  degree  2


0x0000001100050317
0x00100354041e2621
0x0004113402560e21
0x0102041910254397
0x0112075c143e66a1
0x0106153c12764ea1
0x0113065a152c72c9
0x0107143a13645ac9
0x0103051f113757ff
0x00110252050c3249
0x0005103203441a49
0x000101170117177f
0x0102041810244281
0x0106153d12774fb7
0x0112075d143f67b7
0x0000001000040201
0x00041 13502570B7 
0x00100355041f2737
0x0005103303451b5f
0x00110253050d335f
0x0001011601161669
0x0107143b13655bdf
0x0113065b152d73df
0x0103051e113656e9
0x0103051e113656e8
0x0113065b152d73de
0x0107143b13655bde
0x0001011601161668 (h)
0x00110253050d335e
0x0005103303451b5e
0x0010035504112736 
0x00041 13502570B6 
0x0000001000040200 (h)
0x0112075d143f67b6
0x0106153d12774fb6
0x0102041810244280
0x000101170117177e
0x0005103203441a48
0x00110252050c3248
0x0103051f113757fe
0x0107143a13645ac8
0x0113065a152c72c8
0x0106153c12764ea0
0x0112075c143e66a0
0x0102041910254396
0x00100354041e2620
0x0004113402560e20
0x0000001100050316


C.2  ROTATION SYMMETRIC BALANCED FUNCTIONS OF 6 VARIABLES
AND HIGHEST NONLINEARITY

The following functions have nonlinearity 24, the highest nonlinearity for this
group. The bent functions for n=6 have nonlinearity 28. The following functions are
listed in truth table form; (h) means the function is homogeneous.


Degree 2 : 0x5365f6c36fa6ca0 (h)
Degree 2 : 0x7b8b848bd121d1de
Degree 2 : 0x84747b742ede2e21
Degree 2 : 0xfac9a093c905935f
Degree 3 : 0x107152f13735dfe
Degree 3 : 0x113074f153b75fe
Degree 3 : 0x131f16ea5768f8c8
Degree 3 : 0x172e5ca972e1c996
Degree 3 : 0x173a4ec974a9e196
Degree 3 : 0x6987952e93725ce8
Degree 3 : 0x6993874e953a74e8
Degree 3 : 0x6da2cd0db0b345b6
Degree 3 : 0x7faedca8f2e0c880 (h)
Degree 3 : 0x7faedca8f2e0c880
Degree 3 : 0x7fbacec8f4a8e080 (h)
Degree 3 : 0x804531370b571f7f
Degree 3 : 0x805123570d1f377f
Degree 3 : 0x925d32f24f4cba49
Degree 3 : 0x966c78b16ac58b17
Degree 3 : 0x96786ad16c8da317
Degree 3 : 0xe8c5b1368b561e69
Degree 3 : 0xe8d1a3568d1e3669
Degree 3 : 0xece0e915a8970737
Degree 3 : 0xfeecf8b0eac48a01
Degree 3 : 0xfef8ead0ec8ca201
Degree 4 : 0x151767077b3d7e
Degree 4 : 0x117176e177a7ce8
Degree 4 : 0x130f14ab5361d9de
Degree 4 : 0x131b06cb5529f1de
Degree 4 : 0x121d16e34769b95e
Degree 4 : 0x5265d2d32f34db6
Degree 4 : 0x5324f4d34bb65b6
Degree 4 : 0x4345f6526fb2d36
Degree 4 : 0x163c5ee166e9a916
Degree 4 : 0x173e5ee876e8e880
Degree 4 : 0x143251f193757fe
Degree 4 : 0x4535370b771f7e
Degree 4 : 0x147353e1b765ee8
Degree 4 : 0x5127570d3f377e
Degree 4 : 0x153275e1d3e76e8
Degree 4 : 0x5537760f7e3e68
Degree 4 : 0x134b249b5925d3de
Degree 4 : 0x124d34b34b659b5e
Degree 4 : 0x134f34ba5b64dac8
Degree 4 : 0x125926d34d2db35e
Degree 4 : 0x135b26da5d2cf2c8
Degree 4 : 0x125d36f24f6cba48
Degree 4 : 0x5626d1d38b747b6
Degree 4 : 0x4647d352af70f36
Degree 4 : 0x5667d3c3af64ea0
Degree 4 : 0x4706f552cbf2736
Degree 4 : 0x5726f5c3cbe66a0
Degree 4 : 0x176a6c9978a5c396
Degree 4 : 0x166c7cb16ae58b16
Degree 4 : 0x176e7cb87ae4ca80
Degree 4 : 0x16786ed16cada316
Degree 4 : 0x177a6ed87cace280
Degree 4 : 0x167c7ef06eecaa00
Degree 4 : 0x6983850f913355fe
Degree 4 : 0x6885952783731d7e
Degree 4 : 0x68918747853b357e
Degree 4 : 0x68959766877a3c68
Degree 4 : 0x7a8d94a3c361995e
Degree 4 : 0x7b8f94aad360d8c8
Degree 4 : 0x7a9986c3c529b15e
Degree 4 : 0x7b9b86cad528f0c8
Degree 4 : 0x7a9d96e2c768b848
Degree 4 : 0x6ca4dd25a2f30d36